Preface



This volume contains the proceedings of MPC 2000, the fifth international
conference on Mathematics of Program Construction. This series of conferences
aims to promote the development of mathematical principles and techniques 
that are demonstrably useful and usable in the process of constructing com- 
puter programs (whether implemented in hardware or software). The focus is 
on techniques that combine precision with concision, enabling programs to be 
constructed by formal calculation. Within this theme, the scope of the series 
is very diverse, including programming methodology, program specification and 
transformation, programming paradigms, programming calculi, and program- 
ming language semantics. 

The quality of the papers submitted to the conference was in general very 
high. However, the number of submissions has decreased compared to the previ- 
ous conferences in the series. Each paper was refereed by at least five and often 
more committee members. In order to maintain the high standards of the con- 
ference the committee took a stringent view on quality; this has meant that, in 
some cases, a paper was rejected even though there was a basis for a good con- 
ference or journal paper but the submitted paper did not meet the committee’s 
required standards. In a few cases a good paper was rejected on the grounds 
that it did not fit within the scope of the conference. 

In addition to the 12 papers selected for presentation by the program com- 
mittee, this volume contains the extended abstracts of three invited talks: Inte- 
grating Programming, Properties, and Validation, by Mark Jones (Oregon Grad- 
uate Institute, USA); Regular Expressions Revisited: a Coinductive Approach to 
Streams, Automata, and Power Series, by Jan Rutten (CWI, The Netherlands), 
and Formal Methods and Dependability, by Cliff Jones (University of Newcastle, 
UK). 

The conference took place in Ponte de Lima, Portugal and was organized 
by the Informatics Department of Minho University, Braga, Portugal. The pre- 
vious four conferences were held in 1989 at Twente, The Netherlands, in 1992 
at Oxford, United Kingdom, in 1995 at Kloster Irsee, Germany, and in 1998 at 
Marstrand near Göteborg in Sweden. The proceedings of these conferences were
published as LNCS 375, 669, 947, and 1422, respectively. 

Four international events were co-located with the conference: the second 
workshop on Constructive Methods for Parallel Programming, the second work- 
shop on Generic Programming, the workshop on Subtyping and Dependent Types 
in Programming, and the third workshop on Attribute Grammars and their Ap- 
plications. We thank the organizers of these events for their interest in sharing 
the atmosphere of the conference. 
Integrating Programming, Properties, and Validation



Mark P. Jones 

Department of Computer Science and Engineering 
Oregon Graduate Institute of Science and Technology 
Beaverton, Oregon, USA 
mpj@cse.ogi.edu

Current program development environments provide excellent support for 
many desirable aspects of modern software applications such as performance and 
interoperability, but almost no support for features that could directly enhance 
correctness and reliability. In this talk, I will describe the first steps that we are 
making in a project to develop a new kind of program development environment. 
Our goal is to produce a tool that actively supports and encourages its users in 
thinking about, stating, and validating key properties of software as an integral 
part of the programming process. 

The environment that we are designing will allow programmers to assert 
properties of program elements as part of their source code, capturing intuitions 
and insights about its behavior at the time it is written. These property asser- 
tions will also provide an opportunity to give more precise interfaces to software 
components and libraries. Even by themselves, assertions can provide valuable 
documentation, and can be type checked to ensure a base level of consistency 
with executable portions of the program. Critically, however, our environment 
will allow property assertions to be annotated with “certificates” that provide 
evidence of validity. By adopting a generic interface, many different forms of 
certificate will be supported, offering a wide range of validation options — from 
low-cost instrumentation and automated testing, to machine-assisted proof and 
formal methods. Individual properties and certificates may pass through sev- 
eral points on this spectrum as development progresses, and as higher levels 
of assurance are required. To complete the environment, a suite of “property 
management” tools will provide users with facilities to browse or report on the 
status of properties and associated certificates within a program, and to explore 
different validation strategies. 

We plan to evaluate our system by applying it to some real-world, security-related
problems, and we hope to demonstrate that it can contribute to the development
of more robust and dependable software applications. For example,
a tighter integration between programming and validation should provide the 
mechanisms that we need to move towards a model for accountability in soft- 
ware, and away from the ubiquitous “as is” warranties that we see today. More 
specifically, this could allow vendors to provide and sell software on the strength 
of firm guarantees about its behavior, and allow users to identify the responsible 
parties when a software system fails. 
Abstract. A polytypic value is one that is defined by induction on the
structure of types. In Haskell, types are assigned so-called kinds that
distinguish between manifest types like the type of integers and functions
on types like the list type constructor. Previous approaches to polytypic
programming were restricted in that they only allowed values to be
parameterized by types of one fixed kind. In this paper we show how to define
values that are indexed by types of arbitrary kinds. It turns out that these 
polytypic values possess types that are indexed by kinds. We present 
several examples that demonstrate that the additional flexibility is useful 
in practice. One paradigmatic example is the mapping function, which 
describes the functorial action on arrows. A single polytypic definition 
yields mapping functions for datatypes of arbitrary kinds including first- 
and higher-order functors. Polytypic values enjoy polytypic properties. 
Using kind-indexed logical relations we prove among other things that 
the polytypic mapping function satisfies suitable generalizations of the 
functorial laws. 



1 Introduction 

It is widely accepted that type systems are indispensable for building large and 
reliable software systems. Types provide machine checkable documentation and 
are often helpful in finding programming errors at an early stage. Polymor- 
phism complements type security by flexibility. Polymorphic type systems like 
the Hindley-Milner system [21] allow the definition of functions that behave uni- 
formly over all types. However, even polymorphic type systems are sometimes 
less flexible than one would wish. For instance, it is not possible to define a
polymorphic equality function that works for all types: the parametricity theorem
[29] implies that a function of type ∀a. a → a → Bool must necessarily
be constant. As a consequence, the programmer is forced to program a separate 
equality function for each type from scratch. 

Polytypic programming [3,2] addresses this problem. Actually, equality serves 
as a standard example of a polytypic function that can be defined by induction 
on the structure of types. In a previous paper [8] the author has shown that 
polytypic functions are uniquely defined by giving cases for primitive types, the 
unit type, sums, and products. Given this information a tailor-made equality 
function can be automatically generated for each user-defined type. 



R. Backhouse and J. N. Oliveira (Eds.): MPC 2000, LNCS 1837, pp. 2-27, 2000. 
© Springer-Verlag Berlin Heidelberg 2000



Polytypic Values Possess Polykinded Types

Another useful polytypic function is the so-called mapping function. The 
mapping function of a unary type constructor F applies a given function to each 
element of type a in a given structure of type F a — we tacitly assume that F 
does not include function types. Unlike equality the mapping function is indexed 
by a type constructor, that is, by a function on types. Now, mapping functions 
can be defined for type constructors of arbitrary arity. In the general case the 
mapping function takes n functions and applies the i-th function to each element
of type aᵢ in a given structure of type F a₁ … aₙ. Alas, current approaches to
polytypic programming [11,8] do not allow these mapping functions to be defined at
one stroke. The reason is simply that the mapping functions have different types
for different arities. 

This observation suggests a natural extension of polytypic programming: it 
should be possible to assign a type to a polytypic value that depends on the arity 
of the type-index. Actually, we are more ambitious in that we consider not only 
first-order but also higher-order type constructors. A type constructor is said 
to be higher-order if it operates on type constructors rather than on types. To 
distinguish between types, first-order and higher-order type constructors, they 
are often assigned so-called kinds [17], which can be seen as the ‘types of types’. 
Using the notion of kind we can state the central idea of this paper as follows: 
polytypic values possess types that are defined by induction on the structure of 
kinds. It turns out that the implementation of this idea is much simpler than 
one would expect. 

The rest of this paper is organized as follows. Section 2 illustrates the ap- 
proach using the example of mapping functions. Section 3 introduces the lan- 
guage of kinds and types, which is based on the simply typed lambda calcu- 
lus. Section 4 explains how to define polytypic values and polykinded types 
and Section 5 shows how to specialize a polytypic value to concrete instances 
of datatypes. Section 6 presents several examples of polytypic functions with 
polykinded types, which demonstrate that the extension is useful in practice. 
Polytypic values enjoy polytypic properties. Section 7 shows how to express poly- 
typic laws using kind-indexed logical relations. Among other things, we show that 
the polytypic mapping function satisfies suitable generalizations of the functorial 
laws. Finally, Section 8 reviews related work and Section 9 concludes. 

2 A Worked-Out Example: Mapping Functions 

This section illustrates the central idea by means of a worked-out example: 
mapping functions. For concreteness, the code will be given in the functional 
programming language Haskell 98 [25]. Throughout, we use Haskell as an abbre- 
viation for Haskell 98. Before tackling the polytypic mapping function let us first 
take a look at different datatypes and associated monotypic mapping functions. 
As an aside, note that the combination of a type constructor and its mapping 
function is often referred to as a functor. 

As a first, rather simple example consider the list datatype. 



data List a = Nil | Cons a (List a)






Ralf Hinze 



Actually, List is not a type but a unary type constructor. In Haskell the ‘type’
of a type constructor is specified by the kind system. For instance, List has kind
* → *. The kind * represents manifest types like Int or Bool. The kind κ₁ → κ₂
represents type constructors that map type constructors of kind κ₁ to those of
kind κ₂. The mapping function for List is given by

mapList :: ∀a₁ a₂. (a₁ → a₂) → (List a₁ → List a₂)
mapList mapa Nil = Nil
mapList mapa (Cons v vs) = Cons (mapa v) (mapList mapa vs) .

The mapping function takes a function and applies it to each element of a given 
list. It is perhaps unusual to call the argument function mapa. The reason for 
this choice will become clear as we go along. For the moment it suffices to bear in 
mind that the definition of mapList rigidly follows the structure of the datatype. 
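
As a minimal sketch, the list datatype and its mapping function can be written in plain Haskell as follows. The Functor instance illustrates the remark that a type constructor together with its mapping function forms a functor; the helper fromList is a hypothetical name introduced only for building test values.

```haskell
-- The list datatype from the text.
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- Monotypic mapping function; its clauses follow the shape of the datatype.
mapList :: (a1 -> a2) -> (List a1 -> List a2)
mapList _    Nil         = Nil
mapList mapa (Cons v vs) = Cons (mapa v) (mapList mapa vs)

-- The pair (List, mapList) is a functor in the sense of the Functor class.
instance Functor List where
  fmap = mapList

-- Hypothetical helper: convert a built-in list to a List.
fromList :: [a] -> List a
fromList = foldr Cons Nil
```

For instance, mapList (+ 1) applied to fromList [1, 2, 3] rebuilds the same spine with each element incremented.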

The List type constructor is an example of a so-called regular or uniform 
type. Briefly, a regular type is one that can be defined as the least fixpoint of a 
functor. Interestingly, Haskell’s type system is expressive enough to rephrase List 
using an explicit fixpoint operator [19]. We will repeat this construction in the
following as it provides us with interesting examples of datatypes and associated 
mapping functions. First, we define the so-called base or pattern functor of List. 

data ListF a b = NilF | ConsF a b

The type ListF has kind * → (* → *), which shows that binary type constructors
are curried in Haskell. The following definition introduces a fixpoint operator on
the type level (newtype is a variant of data introducing a new type that is
isomorphic to the type on the right-hand side).

newtype Fix f = In (f (Fix f))

The kind of Fix is (* → *) → *, a so-called second-order kind. The order of a kind
is given by order(*) = 0 and order(κ₁ → κ₂) = max{1 + order(κ₁), order(κ₂)}.
It remains to define List as a fixpoint of its base functor (type defines a type
synonym).

type List' a = Fix (ListF a) 

Now, how can we define the mapping function for lists thus defined? For a start, 
we define the mapping function for the base functor. 

mapListF :: ∀a₁ a₂. (a₁ → a₂) → ∀b₁ b₂. (b₁ → b₂)
                → (ListF a₁ b₁ → ListF a₂ b₂)
mapListF mapa mapb NilF = NilF
mapListF mapa mapb (ConsF v w)
         = ConsF (mapa v) (mapb w)

Since the base functor has two type arguments, its mapping function takes two
functions, mapa and mapb, and applies them to values of type a₁ and b₁, respectively.
Even more interesting is the mapping function for Fix

mapFix :: ∀f₁ f₂. (∀a₁ a₂. (a₁ → a₂) → (f₁ a₁ → f₂ a₂))
              → (Fix f₁ → Fix f₂)
mapFix mapf (In v) = In (mapf (mapFix mapf) v) ,

which takes a polymorphic function as argument. In other words, mapFix has 
a so-called rank-2 type signature [16]. Though not in the current language def- 
inition, rank-2 type signatures are supported by recent versions of the Glasgow 
Haskell Compiler GHC [28] and the Haskell interpreter Hugs [15]. The argument 
function, mapf, has a more general type than one would probably expect: it 
takes a function of type a₁ → a₂ to a function of type f₁ a₁ → f₂ a₂. By contrast,
the mapping function for List (which also has kind * → *) takes a₁ → a₂ to
List a₁ → List a₂. The definition below demonstrates that the extra generality
is vital. 



mapList' :: ∀a₁ a₂. (a₁ → a₂) → (List' a₁ → List' a₂)
mapList' mapa = mapFix (mapListF mapa)

The argument of mapFix has type ∀b₁ b₂. (b₁ → b₂) → (ListF a₁ b₁ → ListF a₂ b₂),
that is, f₁ is instantiated to ListF a₁ and f₂ to ListF a₂.
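
The fixpoint construction can be tried out with a compiler that accepts rank-2 signatures. The following is a sketch assuming GHC with the RankNTypes extension; the quantifiers of mapListF are written in prenex form, and fromList and toList are hypothetical helpers added only to inspect results.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Base (pattern) functor of lists and the type-level fixpoint operator.
data ListF a b = NilF | ConsF a b
newtype Fix f = In (f (Fix f))
type List' a = Fix (ListF a)

mapListF :: (a1 -> a2) -> (b1 -> b2) -> (ListF a1 b1 -> ListF a2 b2)
mapListF _    _    NilF        = NilF
mapListF mapa mapb (ConsF v w) = ConsF (mapa v) (mapb w)

-- Rank-2 signature: the argument must itself be polymorphic in a1 and a2.
mapFix :: (forall a1 a2. (a1 -> a2) -> (f1 a1 -> f2 a2))
       -> (Fix f1 -> Fix f2)
mapFix mapf (In v) = In (mapf (mapFix mapf) v)

-- Here f1 is instantiated to ListF a1 and f2 to ListF a2.
mapList' :: (a1 -> a2) -> (List' a1 -> List' a2)
mapList' mapa = mapFix (mapListF mapa)

-- Hypothetical conversion helpers, used only to inspect results.
fromList :: [a] -> List' a
fromList = foldr (\x xs -> In (ConsF x xs)) (In NilF)

toList :: List' a -> [a]
toList (In NilF)         = []
toList (In (ConsF x xs)) = x : toList xs
```

The extra generality of mapf shows up in the recursive call: mapf is applied to mapFix mapf, a function between Fix f1 and Fix f2, rather than to the element function.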

The list datatype is commonly used to represent sequences of elements. An 
alternative data structure, which supports logarithmic access, is C. Okasaki’s 
type of binary random-access lists [24].



data Fork a = Fork a a 

data Sequ a = Empty | Zero (Sequ (Fork a)) | One a (Sequ (Fork a))

Since the type argument is changed in the recursive calls, Sequ is an example 
of a so-called nested or non-regular datatype [4]. Though the type recursion is 
nested, the definition of the mapping function is entirely straightforward. 



mapFork :: ∀a₁ a₂. (a₁ → a₂) → (Fork a₁ → Fork a₂)
mapFork mapa (Fork v₁ v₂) = Fork (mapa v₁) (mapa v₂)
mapSequ :: ∀a₁ a₂. (a₁ → a₂) → (Sequ a₁ → Sequ a₂)
mapSequ mapa Empty = Empty
mapSequ mapa (Zero vs) = Zero (mapSequ (mapFork mapa) vs)
mapSequ mapa (One v vs) = One (mapa v) (mapSequ (mapFork mapa) vs)



Note that mapSequ requires polymorphic recursion [23]: the recursive calls have
type ∀a₁ a₂. (Fork a₁ → Fork a₂) → (Sequ (Fork a₁) → Sequ (Fork a₂)),
which is a substitution instance of the declared type. Haskell allows polymorphic 
recursion only if an explicit type signature is provided. 
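
The following sketch shows the nested datatype and its mapping function as ordinary Haskell. The value example is a hypothetical test sequence, not taken from the text.

```haskell
data Fork a = Fork a a
  deriving (Eq, Show)

data Sequ a = Empty | Zero (Sequ (Fork a)) | One a (Sequ (Fork a))
  deriving (Eq, Show)

mapFork :: (a1 -> a2) -> (Fork a1 -> Fork a2)
mapFork mapa (Fork v1 v2) = Fork (mapa v1) (mapa v2)

-- The signature is required: the recursive calls use mapSequ at element
-- type Fork a rather than a (polymorphic recursion).
mapSequ :: (a1 -> a2) -> (Sequ a1 -> Sequ a2)
mapSequ _    Empty      = Empty
mapSequ mapa (Zero vs)  = Zero (mapSequ (mapFork mapa) vs)
mapSequ mapa (One v vs) = One (mapa v) (mapSequ (mapFork mapa) vs)

-- Hypothetical example: the element 1, then the pair (2, 3) one level down.
example :: Sequ Int
example = One 1 (One (Fork 2 3) Empty)
```

Removing the signature of mapSequ makes type inference fail, because the inferred monomorphic type cannot cover the call at Fork a.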

Since Sequ is a nested type, it cannot be expressed as a fixpoint of a functor. 
However, it can be rephrased as a fixpoint of a higher-order functor [4]. Again, 
we will carry out the construction to generate examples of higher-order kinded 
datatypes. The higher-order base functor associated with Sequ is 



data SequF s a = EmptyF | ZeroF (s (Fork a)) | OneF a (s (Fork a)) .

Since Sequ has kind * → *, its base functor has kind (* → *) → (* → *). The
fixpoint operator for functors of this kind is given by

newtype HFix h a = HIn (h (HFix h) a) .

Since the fixpoint operator takes a second-order kinded type as argument, it has
a third-order kind: ((* → *) → (* → *)) → (* → *). Finally, we can define Sequ'
as the least fixpoint of SequF.

type Sequ' = HFix SequF

As a last stress test let us define a mapping function for Sequ'. As before we 
begin by defining mapping functions for the component types. 

mapSequF :: ∀s₁ s₂. (∀b₁ b₂. (b₁ → b₂) → (s₁ b₁ → s₂ b₂))
                → ∀a₁ a₂. (a₁ → a₂) → (SequF s₁ a₁ → SequF s₂ a₂)
mapSequF maps mapa EmptyF
         = EmptyF
mapSequF maps mapa (ZeroF vs)
         = ZeroF (maps (mapFork mapa) vs)
mapSequF maps mapa (OneF v vs)
         = OneF (mapa v) (maps (mapFork mapa) vs)

This example indicates why argument maps of kind * → * must be polymorphic:
both calls of maps are instances of the declared type. In general, the argument
mapping function may be applied to many different types. Admittedly, the type
signature of mapSequF looks quite puzzling. However, we will see in a moment
that it is fully determined by SequF’s kind. Even more daunting is the signature
of mapHFix, which has rank 3. Unfortunately, no current Haskell implementation
supports rank-3 type signatures. Hence, the following code cannot be executed.

mapHFix :: ∀h₁ h₂. (∀f₁ f₂. (∀c₁ c₂. (c₁ → c₂) → (f₁ c₁ → f₂ c₂))
                          → ∀b₁ b₂. (b₁ → b₂) → (h₁ f₁ b₁ → h₂ f₂ b₂))
               → ∀a₁ a₂. (a₁ → a₂) → (HFix h₁ a₁ → HFix h₂ a₂)
mapHFix maph mapa (HIn v)
        = HIn (maph (mapHFix maph) mapa v)

Finally, applying mapHFix to mapSequF we obtain the desired function.

mapSequ' :: ∀a₁ a₂. (a₁ → a₂) → (Sequ' a₁ → Sequ' a₂)
mapSequ' = mapHFix mapSequF
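
Compilers with arbitrary-rank types, which postdate this text, can run these definitions. The sketch below assumes GHC with the RankNTypes extension. It departs from the signatures above in one respect: the quantifiers over b1 and b2 are floated to the front of the argument type (an equivalent prenex form), so that mapSequF can be passed without eta-expansion. The helpers toListS, unforks, and example are hypothetical names added for testing.

```haskell
{-# LANGUAGE RankNTypes #-}

data Fork a = Fork a a

data SequF s a = EmptyF | ZeroF (s (Fork a)) | OneF a (s (Fork a))
newtype HFix h a = HIn (h (HFix h) a)
type Sequ' = HFix SequF

mapFork :: (a1 -> a2) -> (Fork a1 -> Fork a2)
mapFork mapa (Fork v1 v2) = Fork (mapa v1) (mapa v2)

-- The argument maps must be polymorphic: it is used at Fork a1 -> Fork a2.
mapSequF :: (forall b1 b2. (b1 -> b2) -> (s1 b1 -> s2 b2))
         -> (a1 -> a2) -> (SequF s1 a1 -> SequF s2 a2)
mapSequF _    _    EmptyF      = EmptyF
mapSequF maps mapa (ZeroF vs)  = ZeroF (maps (mapFork mapa) vs)
mapSequF maps mapa (OneF v vs) = OneF (mapa v) (maps (mapFork mapa) vs)

-- Rank-3 signature; b1 and b2 are quantified together with f1 and f2.
mapHFix :: (forall f1 f2 b1 b2.
              (forall c1 c2. (c1 -> c2) -> (f1 c1 -> f2 c2))
              -> (b1 -> b2) -> (h1 f1 b1 -> h2 f2 b2))
        -> (a1 -> a2) -> (HFix h1 a1 -> HFix h2 a2)
mapHFix maph mapa (HIn v) = HIn (maph (mapHFix maph) mapa v)

mapSequ' :: (a1 -> a2) -> (Sequ' a1 -> Sequ' a2)
mapSequ' = mapHFix mapSequF

-- Hypothetical helpers: flatten a sequence to a built-in list.
unforks :: [Fork a] -> [a]
unforks = concatMap (\(Fork x y) -> [x, y])

toListS :: Sequ' a -> [a]
toListS (HIn EmptyF)      = []
toListS (HIn (ZeroF vs))  = unforks (toListS vs)
toListS (HIn (OneF v vs)) = v : unforks (toListS vs)

example :: Sequ' Int
example = HIn (OneF 1 (HIn (OneF (Fork 2 3) (HIn EmptyF))))
```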

Now, let us define a polytypic version of map. The monotypic instances above 
already indicate that the type of the mapping function depends on the kind of 
the type index. In fact, the type of map can be defined by induction on the 
structure of kinds. A note on notation: we will write type and kind indices in 
angle brackets. Hence, map⟨t :: κ⟩ denotes the application of the polytypic map
to the type t of kind κ. We use essentially the same syntax both for polytypic
values and for polykinded types. However, they are easily distinguished by their
‘types’, where the ‘type’ of kinds is given by the superkind ‘□’ (‘*’ and ‘□’ are
sometimes called sorts).

What is the type of map if the type-index has kind *? For a manifest type,
say, t, the mapping function map⟨t :: *⟩ equals the identity function. Hence, its
type is t → t. In general, the mapping function map⟨t :: κ⟩ has type Map⟨κ⟩ t t,
where Map⟨κ⟩ is defined as follows.

Map⟨κ :: □⟩ :: κ → κ → *
Map⟨*⟩ t₁ t₂ = t₁ → t₂
Map⟨κ₁ → κ₂⟩ t₁ t₂ = ∀x₁ x₂. Map⟨κ₁⟩ x₁ x₂ → Map⟨κ₂⟩ (t₁ x₁) (t₂ x₂)

The first line of the definition is the so-called kind signature, which makes precise
that Map⟨κ :: □⟩ maps two types of kind κ to a manifest type. In the base case
Map⟨*⟩ t₁ t₂ equals the type of a conversion function. The inductive case has
a very characteristic form, which we will encounter time and again. It specifies
that a ‘conversion function’ between the type constructors t₁ and t₂ is a function
that maps a conversion function between x₁ and x₂ to a conversion function
between t₁ x₁ and t₂ x₂, for all possible instances of x₁ and x₂. Roughly speaking,
Map⟨κ₁ → κ₂⟩ t₁ t₂ is the type of a ‘conversion function’-transformer. It is not
hard to see that the type signatures we have encountered before are instances of
this scheme.
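
To see how the earlier signatures arise from this scheme, the kind-indexed type can be unfolded by hand for a few kinds. The sketch below assumes GHC with RankNTypes; the synonym names (MapStar and so on) and the example functions mapMaybe' and mapEither' are hypothetical, and Haskell has no kind-indexed types, so each kind gets its own synonym.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Map<*> t1 t2: a plain conversion function.
type MapStar t1 t2 = t1 -> t2

-- Map<* -> *> t1 t2 = forall x1 x2. Map<*> x1 x2 -> Map<*> (t1 x1) (t2 x2)
type MapStarStar t1 t2 =
  forall x1 x2. MapStar x1 x2 -> MapStar (t1 x1) (t2 x2)

-- Map<* -> * -> *> t1 t2: the inductive case unfolded twice.
type MapStar3 t1 t2 =
  forall x1 x2. MapStar x1 x2 -> MapStarStar (t1 x1) (t2 x2)

-- Map<(* -> *) -> *> t1 t2: the argument is itself a transformer (rank 2).
type MapHigher t1 t2 =
  forall x1 x2. MapStarStar x1 x2 -> MapStar (t1 x1) (t2 x2)

mapMaybe' :: MapStarStar Maybe Maybe
mapMaybe' _ Nothing  = Nothing
mapMaybe' f (Just x) = Just (f x)

mapEither' :: MapStar3 Either Either
mapEither' f _ (Left x)  = Left (f x)
mapEither' _ g (Right y) = Right (g y)

-- The signature of mapFix is the instance at kind (* -> *) -> *.
newtype Fix f = In (f (Fix f))

mapFix :: MapHigher Fix Fix
mapFix mapf (In v) = In (mapf (mapFix mapf) v)
```

Expanding MapHigher Fix Fix yields exactly the rank-2 signature given for mapFix earlier in this section.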

How can we define the polytypic mapping function itself? It turns out that 
the technique described in [8] carries over to the polykinded case, that is, to 
define a polytypic value it suffices to give cases for primitive types, the unit type 
‘1’, sums ‘+’, and products ‘×’. To be able to give polytypic definitions in a
pointwise style, we treat 1, ‘+’, and ‘×’ as if they were given by the following
datatype declarations. 



data 1 = ()
data a + b = Inl a | Inr b
data a × b = (a, b)

Assuming that we have only one primitive type, Int, the polytypic mapping 
function is given by 



map⟨t :: κ⟩ :: Map⟨κ⟩ t t
map⟨1⟩ () = ()
map⟨Int⟩ i = i
map⟨+⟩ mapa mapb (Inl v) = Inl (mapa v)
map⟨+⟩ mapa mapb (Inr w) = Inr (mapb w)
map⟨×⟩ mapa mapb (v, w) = (mapa v, mapb w)



This straightforward definition contains all the ingredients needed to derive maps 
for arbitrary datatypes of arbitrary kinds. And, in fact, all the definitions we 
have seen before were automatically generated using a prototype implementation 
of the polytypic programming extension described in the subsequent sections.
Finally, note that we can define map even more succinctly if we use a point-free
style; as usual, the maps on sums and products are denoted (+) and (×).

map⟨1⟩ = id
map⟨Int⟩ = id
map⟨+⟩ mapa mapb = mapa + mapb
map⟨×⟩ mapa mapb = mapa × mapb
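
The four cases can be mimicked in plain Haskell by using the built-in unit, Either, and pair types in place of 1, +, and ×. The sketch below also assembles one instance by hand, for the structure type 1 + a × b of ListF; the names ListF' and mapListF' are hypothetical, and the hand assembly stands in for the automatic generation described in the text.

```haskell
-- Cases of the polytypic map over Haskell's unit, sum, and product types.
mapUnit :: () -> ()
mapUnit () = ()

mapInt :: Int -> Int
mapInt i = i

mapSum :: (a1 -> a2) -> (b1 -> b2) -> (Either a1 b1 -> Either a2 b2)
mapSum mapa _    (Left v)  = Left (mapa v)
mapSum _    mapb (Right w) = Right (mapb w)

mapProd :: (a1 -> a2) -> (b1 -> b2) -> ((a1, b1) -> (a2, b2))
mapProd mapa mapb (v, w) = (mapa v, mapb w)

-- Structure type of ListF a b, namely 1 + a × b (an assumption of this sketch).
type ListF' a b = Either () (a, b)

-- Read the type expression 1 + a × b as a value expression:
-- each type constant is replaced by its map case.
mapListF' :: (a1 -> a2) -> (b1 -> b2) -> (ListF' a1 b1 -> ListF' a2 b2)
mapListF' mapa mapb = mapSum mapUnit (mapProd mapa mapb)
```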



3 The Simply Typed Lambda Calculus as a Type 
Language 

This section introduces kinds and types. The type system is essentially that of 
Haskell [25] smoothing away some of its irregularities. Haskell offers one ba- 
sic construct for defining new types: datatype declarations. In general, a data 
declaration has the following form. 

data D x₁ … xₖ = Q₁ t₁₁ … t₁ₘ₁ | ⋯ | Qₙ tₙ₁ … tₙₘₙ

This construct combines no fewer than four different features: type abstraction,
type recursion, n-ary sums, and n-ary products. The types on the right-hand 
side are built from type constants (that is, primitive type constructors), type 
variables, and type application. Thus, Haskell’s type system essentially corre- 
sponds to the simply typed lambda calculus with kinds playing the role of types. 
This motivates the following definitions. 

Definition 1. Kind terms are formed according to the following grammar. 

K ::= * | (K → K)
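
As a small sketch, the kind grammar can be represented by a Haskell datatype, together with the order function from Section 2. The constructor names Star and :-> and the four example kinds are choices made for this illustration.

```haskell
-- Kind terms: K ::= * | (K -> K)
data Kind = Star | Kind :-> Kind
  deriving (Eq, Show)
infixr 5 :->

-- order(*) = 0, order(k1 -> k2) = max (1 + order k1) (order k2)
order :: Kind -> Int
order Star        = 0
order (k1 :-> k2) = max (1 + order k1) (order k2)

-- Kinds of the type constructors used in Section 2.
kindList, kindListF, kindFix, kindHFix :: Kind
kindList  = Star :-> Star
kindListF = Star :-> Star :-> Star
kindFix   = (Star :-> Star) :-> Star
kindHFix  = ((Star :-> Star) :-> (Star :-> Star)) :-> (Star :-> Star)
```

Evaluating order on these kinds gives 1, 1, 2, and 3, in agreement with the remarks that Fix has a second-order kind and HFix a third-order kind.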

Type terms are built from type constants and type variables using type ap- 
plication and type abstraction. We annotate type constants and type variables 
with their kinds, that is, a type variable is a pair consisting of a name and a 
kind. If (s, κ) is a type constant or a type variable, we define kind (s, κ) = κ.

Definition 2. Given a set of kinded type constants C ⊆ Σ* × K and a set of
kinded type variables X ⊆ Σ* × K, type pseudo-terms are formed according to

T ::= C | X | (T T) | (ΛX.T) | μ_K | ∀_K .

The choice of C is more or less arbitrary; we only require C to include functional
types. For concreteness, we use

C = {1 :: *, Int :: *, (+) :: * → * → *, (×) :: * → * → *, (→) :: * → * → *} .

The set of pseudo-terms includes a family of fixpoint operators indexed by kind:
μ_* corresponds to Fix and μ_{* → *} to HFix. Usually, we write μx.t for μ_κ(Λx.t)
and similarly for the universal quantifier ∀_κ.
Definition 3. The set of legal type terms of kind κ, notation T(κ), is defined by
T(κ) = {t ∈ T | ⊢ t :: κ}, where ⊢ t :: κ means that the statement t :: κ is derivable
using the rules depicted in Fig. 1. The set of monomorphic type terms of kind κ,
denoted T°(κ), is the subset of T(κ) of terms not containing occurrences of ∀.

(t-const)    c :: kind c

(t-var)      x :: kind x

                     t :: κ
(t-→-intro)  ----------------------
             (Λx.t) :: (kind x → κ)

             t₁ :: (κ₁ → κ₂)    t₂ :: κ₁
(t-→-elim)   ---------------------------
                   (t₁ t₂) :: κ₂

(t-fix)      μ_κ :: (κ → κ) → κ

(t-all)      ∀_κ :: (κ → *) → *

Fig. 1. Kind rules



It is worth noting that since type constants and type variables are annotated 
with their kinds, we do not require a typing environment. Furthermore, note 
that the type arguments of polytypic values will be restricted to monomorphic 
types, see Section 5. 
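
The kind rules of Fig. 1 translate directly into a small kind checker. The following sketch uses a hypothetical Haskell representation of type pseudo-terms; as in the text, constants and variables carry their kinds, so no environment is needed. The term listT, which encodes List as μ applied to an abstraction, is an example constructed for this illustration.

```haskell
data Kind = Star | Kind :-> Kind
  deriving (Eq, Show)
infixr 5 :->

-- Type pseudo-terms; constants and variables are annotated with kinds.
data Type
  = Con String Kind       -- kinded type constant
  | Var String Kind       -- kinded type variable
  | App Type Type         -- type application
  | Lam String Kind Type  -- type abstraction over a kinded variable
  | Mu Kind               -- fixpoint operator at a given kind
  | All Kind              -- universal quantifier at a given kind
  deriving (Eq, Show)

-- Kind of a legal type term, or Nothing if the term is ill-kinded.
kindOf :: Type -> Maybe Kind
kindOf (Con _ k)   = Just k                        -- (t-const)
kindOf (Var _ k)   = Just k                        -- (t-var)
kindOf (Lam _ k t) = fmap (k :->) (kindOf t)       -- (t-→-intro)
kindOf (App t1 t2) = do                            -- (t-→-elim)
  k  <- kindOf t1
  k1 <- kindOf t2
  case k of
    k1' :-> k2 | k1' == k1 -> Just k2
    _                      -> Nothing
kindOf (Mu k)      = Just ((k :-> k) :-> k)        -- (t-fix)
kindOf (All k)     = Just ((k :-> Star) :-> Star)  -- (t-all)

-- List encoded as Λa. μ (Λl. 1 + a × l).
listT :: Type
listT =
  Lam "a" Star
    (App (Mu Star)
       (Lam "l" Star
          (App (App plus unit)
               (App (App times (Var "a" Star)) (Var "l" Star)))))
  where
    unit  = Con "1" Star
    plus  = Con "+" (Star :-> Star :-> Star)
    times = Con "×" (Star :-> Star :-> Star)
```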

Given this type language we can easily translate datatype declarations into 
type terms. For instance, the type T defined by the schematic data declaration 
above is modelled by (we tacitly assume that the kinds of the type variables 
have been inferred) 

μD.Λx₁ … xₖ.(t₁₁ × ⋯ × t₁ₘ₁) + ⋯ + (tₙ₁ × ⋯ × tₙₘₙ) ,

where t₁ × ⋯ × t₀ = 1. For simplicity, n-ary sums are reduced to binary sums
and n-ary products to binary products. 

In Section 5 we require the following conversion rules for type terms. 

Definition 4. The convertibility relation on type terms, denoted ‘↔’, is given
by the following axioms

(Λx.t) u ↔ t [x := u]    (β)
μt ↔ t (μt)              (μ)

plus rules for reflexivity, symmetry, transitivity, and congruence.

4 Defining Polytypic Values

The definition of a polytypic value consists of two parts: a type signature, which
typically involves a polykinded type, and a set of equations, one for each type
constant. Likewise, the definition of a polykinded type consists of two parts:
a kind signature and one equation for kind *. Interestingly, the equation for
functional kinds need not be explicitly specified. It is inevitable because of the
way type constructors of kind κ₁ → κ₂ are specialized. We will return to this
point in the following section. In general, a polykinded type definition has the 
following schematic form. 

Poly⟨κ :: □⟩ :: κ → ⋯ → κ → *
Poly⟨*⟩ t₁ … tₙ = …
Poly⟨κ₁ → κ₂⟩ t₁ … tₙ = ∀x₁ … xₙ. Poly⟨κ₁⟩ x₁ … xₙ
                          → Poly⟨κ₂⟩ (t₁ x₁) … (tₙ xₙ)

The kind signature makes precise that the kind-indexed type Poly⟨κ :: □⟩ maps
n types of kind κ to a manifest type (for Map⟨κ :: □⟩ we had n = 2). The polytypic
programmer merely has to fill out the right-hand side of the first equation.

Given the polykinded type a polytypic value definition takes on the following 
schematic form. 

poly⟨t :: κ⟩ :: Poly⟨κ⟩ t … t
poly⟨1⟩ = …
poly⟨Int⟩ = …
poly⟨+⟩ polya polyb = …
poly⟨×⟩ polya polyb = …
poly⟨→⟩ polya polyb = …

Again, the polytypic programmer has to fill out the right-hand sides. To be well-typed,
the poly⟨c :: κ⟩ instances must have type Poly⟨κ⟩ c … c as stated in
the type signature. We do not require, however, that an equation is provided for
every type constant c in C. In case an equation for c is missing, we tacitly add
poly⟨c⟩ = undefined. For instance, map is not defined for functional types. In
fact, none of the examples in this paper can be sensibly defined for the function 
space constructor (a polytypic function that can be defined for functional types 
is the embedding-projection map, see [9]). 
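
To illustrate the schema with n = 1, consider a hypothetical polytypic function count that sums up integers supplied for the elements of a structure. Neither count nor the synonyms below appear in the text; they are an instance of the schematic form made up for this sketch, written for GHC with RankNTypes, with one synonym per kind because Haskell lacks kind-indexed types.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A hypothetical kind-indexed type with n = 1:
--   Count<*> t        = t -> Int
--   Count<k1 -> k2> t = forall x. Count<k1> x -> Count<k2> (t x)
type CountStar t  = t -> Int
type CountStar2 t = forall x. CountStar x -> CountStar (t x)
type CountStar3 t = forall x. CountStar x -> CountStar2 (t x)

-- One equation per type constant; the function space is left out.
countUnit :: CountStar ()
countUnit () = 0

countInt :: CountStar Int
countInt _ = 0

countSum :: CountStar3 Either
countSum ca _  (Left v)  = ca v
countSum _  cb (Right w) = cb w

countProd :: CountStar3 (,)
countProd ca cb (v, w) = ca v + cb w
```

Only the equation for kind * (here t -> Int) is a design decision; the types of countSum and countProd follow from the kind * → * → * of their type constants.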

5 Specializing Polytypic Values 

This section is concerned with the specialization of polytypic values to concrete 
instances of datatypes. We have seen in Section 2 that the structure of each 
instance of map⟨t⟩ rigidly follows the structure of t. Perhaps surprisingly, the
intimate correspondence between the type and the value level holds not only 
for map but for all polytypic values. In fact, the process of specialization can 
be seen as an interpretation of the simply typed lambda calculus. The polytypic 
programmer specifies the interpretation of type constants. Given this information 
the meaning of a type term — that is, the specialization of a polytypic value — 
is fixed: type application is interpreted by value application, type abstraction 
by value abstraction, and type recursion by value recursion. Consequently, the
extension of poly to arbitrary type terms is given by (we will refine this definition 
below) 
poly⟨x⟩ = poly_x
poly⟨t₁ t₂⟩ = (poly⟨t₁⟩) (poly⟨t₂⟩)
poly⟨Λx.t⟩ = λpoly_x.poly⟨t⟩
poly⟨μ_κ⟩ = fix .

Note that we allow only monomorphic types as type arguments. This restriction 
is, however, quite mild. Haskell, for instance, does not allow universal quanti- 
fiers in data declarations. For the translation we use a simple variable naming 
convention, which obviates the need for an explicit environment. We agree
that poly⟨x⟩ is mapped to the variable poly_x. We often write poly_x by concatenating
the name of the polytypic value and the name of the type variable as in
mapa. Of course, to avoid name capture we assume that poly_x is distinct from
variables introduced by the polytypic programmer.
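
Read as a program, the untyped specialization is a syntax-directed translation from type terms to value terms. The sketch below uses hypothetical datatypes Type and Term without kind or type annotations; PolyVar p x stands for the variable poly_x of the polytypic value named p, and PolyCon p c for the programmer-supplied instance at the type constant c.

```haskell
-- Type terms without kind annotations (enough for the untyped translation).
data Type
  = Con String      -- type constant such as "List" or "+"
  | Var String      -- type variable
  | App Type Type   -- type application
  | Lam String Type -- type abstraction
  | Mu              -- fixpoint operator
  deriving (Eq, Show)

-- Terms of the target language, again untyped in this sketch.
data Term
  = PolyCon String String   -- instance poly<c> given by the programmer
  | PolyVar String String   -- value variable poly_x
  | TApp Term Term          -- value application
  | TLam String String Term -- value abstraction binding poly_x
  | Fix                     -- value-level fixpoint
  deriving (Eq, Show)

-- specialize p t computes p<t>: application goes to application,
-- abstraction to abstraction, and type recursion to value recursion.
specialize :: String -> Type -> Term
specialize p (Con c)     = PolyCon p c
specialize p (Var x)     = PolyVar p x
specialize p (App t1 t2) = TApp (specialize p t1) (specialize p t2)
specialize p (Lam x t)   = TLam p x (specialize p t)
specialize _ Mu          = Fix

-- Matrix = Λa. List (List a), with List treated as a constant here.
matrixT :: Type
matrixT = Lam "a" (App (Con "List") (App (Con "List") (Var "a")))
```

Applied to "map" and matrixT, the translation yields the term for λmapa. mapList (mapList mapa).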

The definition of poly makes precise that the specialization for a type constructor
is a function. This explains why Poly⟨κ₁ → κ₂⟩ t₁ … tₙ must necessarily
be a functional type. In the rest of this section we show that poly⟨t⟩ is indeed
well-typed. To this end we must first fix the target language we are compiling 
polytypic values to. Since we require first-class polymorphism, we will use a variant
of the polymorphic lambda calculus [6], Fω, augmented by a polymorphic
fixpoint operator. A similar language is also used as the internal language of the 
Glasgow Haskell Compiler [26]. 

Similar to the type language we annotate value constants and variables with 
their types. If (s, t) is a constant or a variable, we define type (s, t) = t. Note 
that the type of a constant must be closed. 

Definition 5. Given a set of typed constants P ⊆ Σ* × T(*) and a set of typed
variables V ⊆ Σ* × T(*), pseudo-terms are formed according to

E ::= P | V | (E E) | (λV.E) | (E T) | (λX.E) | fix .

Here, e t denotes universal application and λx.e denotes universal abstraction.
Note that we use the same syntax for value abstraction λv.e (here v is a value
variable) and universal abstraction λx.e (here x is a type variable). The set of
value constants P should include enough suitable functions to enable defining
the poly⟨c⟩ instances for each of the type constants c in C.

Definition 6. The set of legal terms of type t, notation E(t), is defined by
E(t) = {e ∈ E | ⊢ e :: t}, where ⊢ e :: t means that the statement e :: t is derivable
using the rules depicted in Fig. 2.

(const)     p :: type p

(var)       v :: type v

                    e :: t
(→-intro)   ----------------------
            (λv.e) :: (type v → t)

            e₁ :: (t₁ → t₂)    e₂ :: t₁
(→-elim)    ---------------------------
                  (e₁ e₂) :: t₂

                 e :: t
(∀-intro)   ----------------    x not free in the type of a free variable of e
            (λx.e) :: (∀x.t)

            e :: (∀x.t)    u :: kind x
(∀-elim)    --------------------------
                (e u) :: t [x := u]

(fix)       fix :: ∀a. (a → a) → a

            e :: t    t ↔ u
(conv)      ---------------
                e :: u

Fig. 2. Typing rules

The type signature of a polytypic value determines the type for closed type
indices. However, since the specialization is defined by induction on the structure
of type terms, we must also explicate the type for type indices that contain
free type variables. To motivate the necessary amendments let us take a look
at an example first. Consider specializing map for the type Matrix given by
Λa.List (List a). The fully typed definition of mapMatrix is

mapMatrix :: ∀a₁ a₂. (a₁ → a₂) → (Matrix a₁ → Matrix a₂)
mapMatrix = λa₁ a₂.λmapa :: a₁ → a₂. mapList (List a₁) (List a₂)
                                       (mapList a₁ a₂ mapa) .
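
The explicit type arguments of the fully typed definition can be reproduced in GHC, assuming the ScopedTypeVariables and TypeApplications extensions, where universal application is written @t. This is a sketch of the correspondence only; GHC would also infer both instantiations if they were omitted.

```haskell
{-# LANGUAGE ScopedTypeVariables, TypeApplications #-}

data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

type Matrix a = List (List a)

-- The explicit forall fixes the order of the type arguments a1, a2.
mapList :: forall a1 a2. (a1 -> a2) -> (List a1 -> List a2)
mapList _    Nil         = Nil
mapList mapa (Cons v vs) = Cons (mapa v) (mapList mapa vs)

-- Outer call: instantiated at List a1 and List a2.
-- Inner call: instantiated at a1 and a2, giving List a1 -> List a2.
mapMatrix :: forall a1 a2. (a1 -> a2) -> (Matrix a1 -> Matrix a2)
mapMatrix mapa =
  mapList @(List a1) @(List a2) (mapList @a1 @a2 mapa)
```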

First of all, the type of $\mathit{mapMatrix}$ determines the type of $\mathit{map}_a$, which is
$\mathit{Map}\langle\star\rangle\ a_1\ a_2 = a_1 \to a_2$. Now, Matrix contains the type term $\mathit{List}\ a$, in which $a$
occurs free. The corresponding mapping function is $\mathit{mapList}\ a_1\ a_2\ \mathit{map}_a$, which
has type $\mathit{List}\ a_1 \to \mathit{List}\ a_2$. In general, $\mathit{poly}\langle t :: \kappa\rangle$ has type $\mathit{Poly}\langle\kappa\rangle\ (t)_1 \ldots (t)_n$,
where $(t)_i$ denotes the type term $t$ in which every free type variable $x$ has been
replaced by $x_i$. To make this work we lay down that the value variable $\mathit{poly}_x$ has
type $\mathit{Poly}\langle\mathit{kind}\ x\rangle\ x_1 \ldots x_n$ and that the $x_i$ are fresh type variables associated
with $x$. Given these prerequisites the fully typed extension of poly is defined by

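\[
\begin{array}{lcl}
\mathit{poly}\langle t :: \kappa\rangle & :: & \mathit{Poly}\langle\kappa\rangle\ (t)_1 \ldots (t)_n \\
\mathit{poly}\langle x\rangle & = & \mathit{poly}_x \\
\mathit{poly}\langle t_1\ t_2\rangle & = & (\mathit{poly}\langle t_1\rangle)\ (t_2)_1 \ldots (t_2)_n\ (\mathit{poly}\langle t_2\rangle) \\
\mathit{poly}\langle \Lambda x.t\rangle & = & \lambda x_1 \ldots x_n.\lambda \mathit{poly}_x.\mathit{poly}\langle t\rangle \\
\mathit{poly}\langle \mu_\kappa\rangle & = & \lambda x_1 \ldots x_n.\lambda \mathit{poly}_x.\mathit{fix}\ (\mathit{Poly}\langle\kappa\rangle\ (\mu x_1) \ldots (\mu x_n))\ (\mathit{poly}_x\ (\mu x_1) \ldots (\mu x_n))\ .
\end{array}
\]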

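Theorem 1. If $\mathit{poly}\langle c\rangle \in E(\mathit{Poly}\langle\mathit{kind}\ c\rangle\ c \ldots c)$ for all type constants $c \in C$,
then $\mathit{poly}\langle u\rangle \in E(\mathit{Poly}\langle\kappa\rangle\ (u)_1 \ldots (u)_n)$ for all type terms $u \in T^{\circ}(\kappa)$.

Proof. We use induction on the derivation of $u :: \kappa$. We content ourselves with
one illustrative case.

Case $u = \mu_\kappa$: Note that $\mathit{kind}\ x_i = \mathit{kind}\ x = \kappa \to \kappa$ (the kinds of the type
variables do not appear explicitly in the definition of poly). By convention, we
have $\mathit{poly}_x :: \mathit{Poly}\langle\kappa \to \kappa\rangle\ x_1 \ldots x_n$. Consequently,

\[
\begin{array}{l}
\mathit{poly}_x\ (\mu x_1) \ldots (\mu x_n) \\
\quad :: \mathit{Poly}\langle\kappa\rangle\ (\mu x_1) \ldots (\mu x_n) \to \mathit{Poly}\langle\kappa\rangle\ (x_1\ (\mu x_1)) \ldots (x_n\ (\mu x_n))
\end{array}
\]

and since $\mu t \leftrightarrow_{\beta} t\ (\mu t)$ and $\mathit{fix} :: \forall a.(a \to a) \to a$ we obtain

\[
\begin{array}{l}
\lambda x_1 \ldots x_n.\lambda \mathit{poly}_x.\mathit{fix}\ (\mathit{Poly}\langle\kappa\rangle\ (\mu x_1) \ldots (\mu x_n))\ (\mathit{poly}_x\ (\mu x_1) \ldots (\mu x_n)) \\
\quad :: \forall x_1 \ldots x_n.\mathit{Poly}\langle\kappa \to \kappa\rangle\ x_1 \ldots x_n \to \mathit{Poly}\langle\kappa\rangle\ (\mu x_1) \ldots (\mu x_n)
\end{array}
\]

as desired. □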

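Let us conclude the section by noting a trivial consequence of the specialization.
Since the structure of types is reflected on the value level, we have
$\mathit{poly}\langle \Lambda a.f\ (g\ a)\rangle = \lambda \mathit{poly}_a.\mathit{poly}\langle f\rangle\ (\mathit{poly}\langle g\rangle\ \mathit{poly}_a)$. Writing composition as
usual this implies, in particular, that $\mathit{map}\langle f \cdot g\rangle = \mathit{map}\langle f\rangle \cdot \mathit{map}\langle g\rangle$. Perhaps
surprisingly, this relationship holds for all polytypic values, not only for mapping
functions. A similar observation is that $\mathit{poly}\langle \Lambda a.a\rangle = \lambda \mathit{poly}_a.\mathit{poly}_a$ for all
polytypic values. Abbreviating $\Lambda a.a$ by Id we have, in particular, that $\mathit{map}\langle \mathit{Id}\rangle = \mathit{id}$.
As an aside, note that these polytypic identities are not to be confused with the
familiar functorial laws $\mathit{map}\langle f\rangle\ \mathit{id} = \mathit{id}$ and $\mathit{map}\langle f\rangle\ (\varphi \cdot \psi) = \mathit{map}\langle f\rangle\ \varphi \cdot \mathit{map}\langle f\rangle\ \psi$
(see Section 7.1), which are base-level identities.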
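The two identities above can be illustrated with ordinary Haskell lists. The following is only a sketch under the assumption that List is modelled by the built-in list type; the names `mapList`, `mapMatrix` and `mapId` are ours and merely mirror $\mathit{map}\langle \mathit{List}\rangle$, $\mathit{map}\langle \mathit{Matrix}\rangle$ and $\mathit{map}\langle \mathit{Id}\rangle$.

```haskell
-- Matrix = \a -> List (List a), modelled with built-in lists (an assumption).
type Matrix a = [[a]]

-- map<List>, written out so that the structure of the type stays visible.
mapList :: (a1 -> a2) -> [a1] -> [a2]
mapList _    []       = []
mapList mapa (x : xs) = mapa x : mapList mapa xs

-- map<Matrix> = \mapa -> map<List> (map<List> mapa):
-- the nesting of List inside List is mirrored by nested calls.
mapMatrix :: (a1 -> a2) -> Matrix a1 -> Matrix a2
mapMatrix mapa = mapList (mapList mapa)

-- map<Id> = \mapa -> mapa for the identity type constructor \a -> a.
mapId :: (a1 -> a2) -> a1 -> a2
mapId mapa = mapa
```

Note that no separate equation for the composition of type constructors is needed: `mapMatrix` falls out of the structure of the type term.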



6 Examples 



6.1 Polytypic Equality 

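The equality function equal serves as a typical example of a polytypic value. The
polykinded equality type is fairly straightforward: for a manifest type $t$, $\mathit{equal}\langle t\rangle$
has type $t \to t \to \mathit{Bool}$, which determines the following definition.

\[
\begin{array}{lcl}
\mathit{Equal}\langle\kappa :: \Box\rangle & :: & \kappa \to \star \\
\mathit{Equal}\langle\star\rangle\ t & = & t \to t \to \mathit{Bool} \\
\mathit{Equal}\langle\kappa_1 \to \kappa_2\rangle\ t & = & \forall x.\mathit{Equal}\langle\kappa_1\rangle\ x \to \mathit{Equal}\langle\kappa_2\rangle\ (t\ x)
\end{array}
\]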

For ease of reference we will always list the equation for functional kinds even 
though it is fully determined by the theory. Assuming that a suitable equality 
function for Int is available, the polytypic equality function can be defined as 
follows (for reasons of readability we use a Haskell-like notation rather than $F\omega$).



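\[
\begin{array}{lcl}
\mathit{equal}\langle t :: \kappa\rangle & :: & \mathit{Equal}\langle\kappa\rangle\ t \\
\mathit{equal}\langle 1\rangle\ ()\ () & = & \mathit{True} \\
\mathit{equal}\langle \mathit{Int}\rangle\ i\ j & = & \mathit{equalInt}\ i\ j \\
\mathit{equal}\langle +\rangle\ \mathit{equala}\ \mathit{equalb}\ (\mathit{Inl}\ v_1)\ (\mathit{Inl}\ v_2) & = & \mathit{equala}\ v_1\ v_2 \\
\mathit{equal}\langle +\rangle\ \mathit{equala}\ \mathit{equalb}\ (\mathit{Inl}\ v_1)\ (\mathit{Inr}\ w_2) & = & \mathit{False} \\
\mathit{equal}\langle +\rangle\ \mathit{equala}\ \mathit{equalb}\ (\mathit{Inr}\ w_1)\ (\mathit{Inl}\ v_2) & = & \mathit{False} \\
\mathit{equal}\langle +\rangle\ \mathit{equala}\ \mathit{equalb}\ (\mathit{Inr}\ w_1)\ (\mathit{Inr}\ w_2) & = & \mathit{equalb}\ w_1\ w_2 \\
\mathit{equal}\langle \times\rangle\ \mathit{equala}\ \mathit{equalb}\ (v_1, w_1)\ (v_2, w_2) & = & \mathit{equala}\ v_1\ v_2 \wedge \mathit{equalb}\ w_1\ w_2
\end{array}
\]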



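Now, since equal has a kind-indexed type we can also specialize it for, say, unary
type constructors.

\[
\mathit{equal}\langle f :: \star \to \star\rangle :: \forall x.(x \to x \to \mathit{Bool}) \to (f\ x \to f\ x \to \mathit{Bool})
\]

This gives us an extra degree of flexibility: $\mathit{equal}\langle f\rangle\ \mathit{op}\ v\ w$ checks whether
corresponding elements in $v$ and $w$ are related by $\mathit{op}$. Of course, $\mathit{op}$ need not
be an equality operator. PolyLib [12] defines an analogous function but with a
more general type:

\[
\mathit{pequal}\langle f :: \star \to \star\rangle :: \forall x_1\ x_2.(x_1 \to x_2 \to \mathit{Bool}) \to (f\ x_1 \to f\ x_2 \to \mathit{Bool})\ .
\]

Here, the element types need not be identical. And, in fact, $\mathit{equal}\langle t :: \kappa\rangle$ can be
assigned the more general type $\mathit{PEqual}\langle\kappa\rangle\ t\ t$ given by

\[
\begin{array}{lcl}
\mathit{PEqual}\langle\kappa :: \Box\rangle & :: & \kappa \to \kappa \to \star \\
\mathit{PEqual}\langle\star\rangle\ t_1\ t_2 & = & t_1 \to t_2 \to \mathit{Bool} \\
\mathit{PEqual}\langle\kappa_1 \to \kappa_2\rangle\ t_1\ t_2 & = & \forall x_1\ x_2.\mathit{PEqual}\langle\kappa_1\rangle\ x_1\ x_2 \to \mathit{PEqual}\langle\kappa_2\rangle\ (t_1\ x_1)\ (t_2\ x_2)\ ,
\end{array}
\]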
which gives us an even greater degree of flexibility. 
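As a rough illustration of the more general type, the instances for sums, products and the unit type can be written directly in Haskell and then combined by hand into an instance for lists. This is a sketch only: `Either`, pairs and `()` stand in for $+$, $\times$ and $1$, and the names `equalSum`, `equalProd`, `equalUnit`, `unroll` and `pequalList` are our own, not part of PolyLib.

```haskell
-- equal<+>: two sums are related only if they use the same injection.
equalSum :: (a1 -> a2 -> Bool) -> (b1 -> b2 -> Bool)
         -> Either a1 b1 -> Either a2 b2 -> Bool
equalSum equala _ (Left v1)  (Left v2)  = equala v1 v2
equalSum _ equalb (Right w1) (Right w2) = equalb w1 w2
equalSum _ _      _          _          = False

-- equal<x>: pairs are related componentwise.
equalProd :: (a1 -> a2 -> Bool) -> (b1 -> b2 -> Bool)
          -> (a1, b1) -> (a2, b2) -> Bool
equalProd equala equalb (v1, w1) (v2, w2) = equala v1 v2 && equalb w1 w2

-- equal<1>
equalUnit :: () -> () -> Bool
equalUnit () () = True

-- View a list through its structure: List a ~ 1 + a x List a.
unroll :: [a] -> Either () (a, [a])
unroll []       = Left ()
unroll (x : xs) = Right (x, xs)

-- Hand-specialised instance for lists; the element types may differ.
pequalList :: (a1 -> a2 -> Bool) -> [a1] -> [a2] -> Bool
pequalList op xs ys =
  equalSum equalUnit (equalProd op (pequalList op)) (unroll xs) (unroll ys)
```

For example, `pequalList (\n s -> n == length s) [1, 2] ["a", "bc"]` relates a list of numbers to a list of strings of matching lengths, which the less general type of equal would not permit.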

6.2 Mapping and Zipping Functions 

In Section 2 we have seen how to define mapping functions for types of arbitrary 
kinds. Interestingly, the polytypic map subsumes so-called higher-order maps. 
A higher-order functor operates on a functor category, which has as objects 
functors and as arrows natural transformations. In Haskell we can model natural 
transformations by polymorphic functions. 

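\[
\mathbf{type}\ f_1 \mathbin{\dot{\to}} f_2 = \forall a.f_1\ a \to f_2\ a
\]

A natural transformation between functors $F_1$ and $F_2$ is a polymorphic function
of type $F_1 \mathbin{\dot{\to}} F_2$. A higher-order functor $H$ then consists of a type constructor
of kind $(\star \to \star) \to (\star \to \star)$, such as SequF, and an associated mapping function
of type $(f_1 \mathbin{\dot{\to}} f_2) \to (H\ f_1 \mathbin{\dot{\to}} H\ f_2)$. Now, the polytypic map gives us a function
of type

\[
\begin{array}{l}
\mathit{map}\langle H\rangle :: \forall f_1\ f_2.(\forall b_1\ b_2.(b_1 \to b_2) \to (f_1\ b_1 \to f_2\ b_2)) \\
\qquad \to (\forall a_1\ a_2.(a_1 \to a_2) \to (H\ f_1\ a_1 \to H\ f_2\ a_2))\ .
\end{array}
\]

Given a natural transformation $\alpha$ of type $f_1 \mathbin{\dot{\to}} f_2$ there are basically two
alternatives for constructing a function of type $\forall b_1\ b_2.(b_1 \to b_2) \to (f_1\ b_1 \to f_2\ b_2)$:
$\lambda m.\alpha \cdot \mathit{map}\langle f_1\rangle\ m$ or $\lambda m.\mathit{map}\langle f_2\rangle\ m \cdot \alpha$. The naturality of $\alpha$, however, implies
that both alternatives are equal. Consequently, the higher-order map is given by

\[
\begin{array}{l}
\mathit{hmap}\langle h :: (\star \to \star) \to (\star \to \star)\rangle :: \forall f_1\ f_2.(f_1 \mathbin{\dot{\to}} f_2) \to (h\ f_1 \mathbin{\dot{\to}} h\ f_2) \\
\mathit{hmap}\langle h\rangle\ (\alpha :: f_1 \mathbin{\dot{\to}} f_2) = \mathit{map}\langle h\rangle\ (\lambda m.\alpha \cdot \mathit{map}\langle f_1\rangle\ m)\ \mathit{id}\ .
\end{array}
\]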
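The construction of hmap from map can be tried out in Haskell with rank-2 types. The sketch below uses a deliberately small higher-order functor, `Twice f a = f (f a)`; this type, its hand-written map `mapTwice`, and the sample natural transformation `maybeToL` are assumptions made for illustration and do not appear in the text. The `Functor` instance of `f1` plays the role of $\mathit{map}\langle f_1\rangle$.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A type constructor of kind (* -> *) -> (* -> *): applies f twice.
newtype Twice f a = Twice (f (f a))

unTwice :: Twice f a -> f (f a)
unTwice (Twice x) = x

-- map<Twice>, written by hand following the structure \f a -> f (f a).
-- Its first argument is polymorphic, hence the rank-2 type.
mapTwice :: (forall b1 b2. (b1 -> b2) -> f1 b1 -> f2 b2)
         -> (a1 -> a2) -> Twice f1 a1 -> Twice f2 a2
mapTwice mapf mapa (Twice x) = Twice (mapf (mapf mapa) x)

-- hmap<Twice> alpha = map<Twice> (\m -> alpha . map<f1> m) id
hmapTwice :: Functor f1 => (forall a. f1 a -> f2 a) -> Twice f1 b -> Twice f2 b
hmapTwice alpha = mapTwice (\m -> alpha . fmap m) id

-- A sample natural transformation from Maybe to lists.
maybeToL :: Maybe a -> [a]
maybeToL Nothing  = []
maybeToL (Just x) = [x]
```

Here `hmapTwice maybeToL` converts both layers of `Maybe` into lists; choosing the other alternative, `\m -> fmap m . alpha` with a `Functor f2` constraint, would give the same function by naturality.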

Using definitions similar to the one in Section 2 we can also implement
embedding-projection maps [9] of type $\mathit{MapE}\langle\star\rangle\ t_1\ t_2 = (t_1 \to t_2, t_2 \to t_1)$,
monadic maps [5,20] of type $\mathit{MapM}\langle\star\rangle\ t_1\ t_2 = t_1 \to M\ t_2$ for some monad $M$,
and arrow maps [13] of type $\mathit{MapA}\langle\star\rangle\ t_1\ t_2 = t_1 \leadsto t_2$ for some arrow type $(\leadsto)$.

Closely related to mapping functions are zipping functions. A binary zipping
function takes two structures of the same shape and combines them into a single
structure. For instance, the list zip takes a function of type $a_1 \to a_2 \to a_3$ and two
lists of type $\mathit{List}\ a_1$ and $\mathit{List}\ a_2$, and applies the function to corresponding
elements, producing a list of type $\mathit{List}\ a_3$. The type of the polytypic zip is essentially
a three-parameter variant of Map.



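\[
\begin{array}{lcl}
\mathit{Zip}\langle\kappa :: \Box\rangle & :: & \kappa \to \kappa \to \kappa \to \star \\
\mathit{Zip}\langle\star\rangle\ t_1\ t_2\ t_3 & = & t_1 \to t_2 \to t_3 \\
\mathit{Zip}\langle\kappa_1 \to \kappa_2\rangle\ t_1\ t_2\ t_3 & = & \forall x_1\ x_2\ x_3.\mathit{Zip}\langle\kappa_1\rangle\ x_1\ x_2\ x_3 \to \mathit{Zip}\langle\kappa_2\rangle\ (t_1\ x_1)\ (t_2\ x_2)\ (t_3\ x_3)
\end{array}
\]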



The definition of zip is similar to that of equal. 



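\[
\begin{array}{lcl}
\mathit{zip}\langle t :: \kappa\rangle & :: & \mathit{Zip}\langle\kappa\rangle\ t\ t\ t \\
\mathit{zip}\langle 1\rangle\ ()\ () & = & () \\
\mathit{zip}\langle \mathit{Int}\rangle\ i\ j & = & \mathbf{if}\ \mathit{equalInt}\ i\ j\ \mathbf{then}\ i\ \mathbf{else}\ \mathit{undefined} \\
\mathit{zip}\langle +\rangle\ \mathit{zipa}\ \mathit{zipb}\ (\mathit{Inl}\ v_1)\ (\mathit{Inl}\ v_2) & = & \mathit{Inl}\ (\mathit{zipa}\ v_1\ v_2) \\
\mathit{zip}\langle +\rangle\ \mathit{zipa}\ \mathit{zipb}\ (\mathit{Inl}\ v_1)\ (\mathit{Inr}\ w_2) & = & \mathit{undefined} \\
\mathit{zip}\langle +\rangle\ \mathit{zipa}\ \mathit{zipb}\ (\mathit{Inr}\ w_1)\ (\mathit{Inl}\ v_2) & = & \mathit{undefined} \\
\mathit{zip}\langle +\rangle\ \mathit{zipa}\ \mathit{zipb}\ (\mathit{Inr}\ w_1)\ (\mathit{Inr}\ w_2) & = & \mathit{Inr}\ (\mathit{zipb}\ w_1\ w_2) \\
\mathit{zip}\langle \times\rangle\ \mathit{zipa}\ \mathit{zipb}\ (v_1, w_1)\ (v_2, w_2) & = & (\mathit{zipa}\ v_1\ v_2, \mathit{zipb}\ w_1\ w_2)
\end{array}
\]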



Note that the result of zip is a partial structure if the two arguments do not
have the same shape. Alternatively, one can define a zipping function of type
$\mathit{Zip}\langle\star\rangle\ t_1\ t_2\ t_3 = t_1 \to t_2 \to \mathit{Maybe}\ t_3$, which uses the exception monad Maybe
to signal incompatibility of the argument structures, see [7].
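The Maybe-based alternative can be sketched in Haskell as follows. The functions `zipSum` and `zipList` are hypothetical hand-written instances for sums and for built-in lists; they are meant only to show how a shape mismatch turns into `Nothing` instead of an undefined result, and are not taken from [7].

```haskell
-- zip<+> in the Maybe variant: mismatching injections yield Nothing.
zipSum :: (a1 -> a2 -> Maybe a3) -> (b1 -> b2 -> Maybe b3)
       -> Either a1 b1 -> Either a2 b2 -> Maybe (Either a3 b3)
zipSum zipa _ (Left v1)  (Left v2)  = Left  <$> zipa v1 v2
zipSum _ zipb (Right w1) (Right w2) = Right <$> zipb w1 w2
zipSum _ _    _          _          = Nothing

-- Instance for lists: succeeds only if both lists have the same length
-- and every elementwise zip succeeds.
zipList :: (a1 -> a2 -> Maybe a3) -> [a1] -> [a2] -> Maybe [a3]
zipList _    []       []       = Just []
zipList zipa (x : xs) (y : ys) = (:) <$> zipa x y <*> zipList zipa xs ys
zipList _    _        _        = Nothing

-- Lift a total combining function into the Maybe setting.
total :: (a1 -> a2 -> a3) -> a1 -> a2 -> Maybe a3
total f x y = Just (f x y)
```

With these definitions `zipList (total (+)) [1, 2] [10, 20]` succeeds, whereas zipping lists of different lengths reports the incompatibility explicitly.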



6.3 Reductions 

A reduction or a crush [18] is a polytypic function that collapses a structure of
values of type $x$ into a single value of type $x$. This section explains how to define
reductions that work for all types of all kinds. To illustrate the main idea let us
start with three motivating examples. The first one is a function that counts the
number of values of type Int within a given structure of some type.

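\[
\begin{array}{lcl}
\mathit{Count}\langle\kappa :: \Box\rangle & :: & \kappa \to \star \\
\mathit{Count}\langle\star\rangle\ t & = & t \to \mathit{Int} \\
\mathit{Count}\langle\kappa_1 \to \kappa_2\rangle\ t & = & \forall x.\mathit{Count}\langle\kappa_1\rangle\ x \to \mathit{Count}\langle\kappa_2\rangle\ (t\ x) \\
\mathit{count}\langle t :: \kappa\rangle & :: & \mathit{Count}\langle\kappa\rangle\ t \\
\mathit{count}\langle 1\rangle\ () & = & 0 \\
\mathit{count}\langle \mathit{Int}\rangle\ i & = & 1 \\
\mathit{count}\langle +\rangle\ \mathit{counta}\ \mathit{countb}\ (\mathit{Inl}\ v) & = & \mathit{counta}\ v \\
\mathit{count}\langle +\rangle\ \mathit{counta}\ \mathit{countb}\ (\mathit{Inr}\ w) & = & \mathit{countb}\ w \\
\mathit{count}\langle \times\rangle\ \mathit{counta}\ \mathit{countb}\ (v, w) & = & \mathit{counta}\ v + \mathit{countb}\ w
\end{array}
\]

Next, let us consider a slight variation: the function $\mathit{size}\langle t\rangle$ defined below is
identical to $\mathit{count}\langle t\rangle$ except for $t = \mathit{Int}$, in which case size also returns 0.

\[
\begin{array}{lcl}
\mathit{size}\langle t :: \kappa\rangle & :: & \mathit{Count}\langle\kappa\rangle\ t \\
\mathit{size}\langle 1\rangle\ () & = & 0 \\
\mathit{size}\langle \mathit{Int}\rangle\ i & = & 0 \\
\mathit{size}\langle +\rangle\ \mathit{sizea}\ \mathit{sizeb}\ (\mathit{Inl}\ v) & = & \mathit{sizea}\ v \\
\mathit{size}\langle +\rangle\ \mathit{sizea}\ \mathit{sizeb}\ (\mathit{Inr}\ w) & = & \mathit{sizeb}\ w \\
\mathit{size}\langle \times\rangle\ \mathit{sizea}\ \mathit{sizeb}\ (v, w) & = & \mathit{sizea}\ v + \mathit{sizeb}\ w
\end{array}
\]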



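It is not hard to see that $\mathit{size}\langle t\rangle\ v$ returns 0 for all types $t$ of kind $\star$ (provided
$v$ is finite and fully defined). So one might be led to conclude that size is not a
very useful function. This conclusion is, however, too rash since size can also be
parameterized by type constructors. For instance, for unary type constructors
size has type

\[
\mathit{size}\langle f :: \star \to \star\rangle :: \forall x.(x \to \mathit{Int}) \to (f\ x \to \mathit{Int})
\]

Now, if we pass the identity function to size, we obtain a function that sums up
a structure of integers. Another viable choice is $\mathit{const}\ 1$; this yields a function
of type $\forall x.f\ x \to \mathit{Int}$ that counts the number of values of type $x$ in a given
structure of type $f\ x$.

\[
\begin{array}{lcl}
\mathit{fsum}\langle f :: \star \to \star\rangle & :: & f\ \mathit{Int} \to \mathit{Int} \\
\mathit{fsum}\langle f\rangle & = & \mathit{size}\langle f\rangle\ \mathit{id} \\
\mathit{fsize}\langle f :: \star \to \star\rangle & :: & \forall x.f\ x \to \mathit{Int} \\
\mathit{fsize}\langle f\rangle & = & \mathit{size}\langle f\rangle\ (\mathit{const}\ 1)
\end{array}
\]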

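Using a similar approach we can also flatten a structure into a list of elements.
The type of the polytypic flattening function

\[
\begin{array}{lcl}
\mathit{Flatten}_z\langle\kappa :: \Box\rangle & :: & \kappa \to \star \\
\mathit{Flatten}_z\langle\star\rangle\ t & = & t \to \mathit{List}\ z \\
\mathit{Flatten}_z\langle\kappa_1 \to \kappa_2\rangle\ t & = & \forall x.\mathit{Flatten}_z\langle\kappa_1\rangle\ x \to \mathit{Flatten}_z\langle\kappa_2\rangle\ (t\ x)
\end{array}
\]

makes use of a simple extension: $\mathit{Flatten}_z\langle\kappa\rangle$ takes an additional type parameter,
$z$, that is passed unchanged to the base case. One can safely think of $z$ as a type
parameter that is global to the definition. The code for flatten is similar to the
code for size.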

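\[
\begin{array}{lcl}
\mathit{flatten}\langle t :: \kappa\rangle & :: & \forall z.\mathit{Flatten}_z\langle\kappa\rangle\ t \\
\mathit{flatten}\langle 1\rangle\ () & = & \mathit{Nil} \\
\mathit{flatten}\langle \mathit{Int}\rangle\ i & = & \mathit{Nil} \\
\mathit{flatten}\langle +\rangle\ \mathit{flattena}\ \mathit{flattenb}\ (\mathit{Inl}\ v) & = & \mathit{flattena}\ v \\
\mathit{flatten}\langle +\rangle\ \mathit{flattena}\ \mathit{flattenb}\ (\mathit{Inr}\ w) & = & \mathit{flattenb}\ w \\
\mathit{flatten}\langle \times\rangle\ \mathit{flattena}\ \mathit{flattenb}\ (v, w) & = & \mathit{flattena}\ v \mathbin{+\!\!+} \mathit{flattenb}\ w
\end{array}
\]

\[
\begin{array}{lcl}
\mathit{fsum}\langle f :: \star \to \star\rangle & :: & f\ \mathit{Int} \to \mathit{Int} \\
\mathit{fsum}\langle f\rangle & = & \mathit{reduce}\langle f\rangle\ 0\ (+)\ \mathit{id} \\[4pt]
\mathit{fsize}\langle f :: \star \to \star\rangle & :: & \forall a.f\ a \to \mathit{Int} \\
\mathit{fsize}\langle f\rangle & = & \mathit{reduce}\langle f\rangle\ 0\ (+)\ (\mathit{const}\ 1) \\[4pt]
\mathit{fand}\langle f :: \star \to \star\rangle & :: & f\ \mathit{Bool} \to \mathit{Bool} \\
\mathit{fand}\langle f\rangle & = & \mathit{reduce}\langle f\rangle\ \mathit{True}\ (\wedge)\ \mathit{id} \\[4pt]
\mathit{fall}\langle f :: \star \to \star\rangle & :: & \forall a.(a \to \mathit{Bool}) \to (f\ a \to \mathit{Bool}) \\
\mathit{fall}\langle f\rangle\ p & = & \mathit{reduce}\langle f\rangle\ \mathit{True}\ (\wedge)\ p \\[4pt]
\mathit{fflatten}\langle f :: \star \to \star\rangle & :: & \forall a.f\ a \to \mathit{List}\ a \\
\mathit{fflatten}\langle f\rangle & = & \mathit{reduce}\langle f\rangle\ \mathit{Nil}\ (\mathbin{+\!\!+})\ \mathit{wrap} \\[4pt]
\mathit{gflatten}\langle g :: \star \to \star \to \star\rangle & :: & \forall a\ b.g\ a\ b \to \mathit{List}\ (a + b) \\
\mathit{gflatten}\langle g\rangle & = & \mathit{reduce}\langle g\rangle\ \mathit{Nil}\ (\mathbin{+\!\!+})\ (\mathit{wrap} \cdot \mathit{Inl})\ (\mathit{wrap} \cdot \mathit{Inr})
\end{array}
\]

Fig. 3. Examples of reductions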



As before, flatten is pointless for types but useful for type constructors.

\[
\begin{array}{lcl}
\mathit{fflatten}\langle f :: \star \to \star\rangle & :: & \forall a.f\ a \to \mathit{List}\ a \\
\mathit{fflatten}\langle f\rangle & = & \mathit{flatten}\langle f\rangle\ \mathit{wrap}\ \ \mathbf{where}\ \mathit{wrap}\ v = \mathit{Cons}\ v\ \mathit{Nil}
\end{array}
\]

The definitions of size and flatten exhibit a common pattern: the elements of
a base type are replaced by a constant (0 and Nil, respectively) and the pair
constructor is replaced by a binary operator ($(+)$ and $(\mathbin{+\!\!+})$, respectively). The
polytypic function reduce abstracts away from these particularities.

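\[
\begin{array}{lcl}
\mathit{Red}_z\langle\kappa :: \Box\rangle & :: & \kappa \to \star \\
\mathit{Red}_z\langle\star\rangle\ t & = & t \to z \\
\mathit{Red}_z\langle\kappa_1 \to \kappa_2\rangle\ t & = & \forall x.\mathit{Red}_z\langle\kappa_1\rangle\ x \to \mathit{Red}_z\langle\kappa_2\rangle\ (t\ x) \\
\mathit{reduce}\langle t :: \kappa\rangle & :: & \forall z.z \to (z \to z \to z) \to \mathit{Red}_z\langle\kappa\rangle\ t \\
\mathit{reduce}\langle t\rangle\ e\ (\oplus) & = & \mathit{red}\langle t\rangle \\
\quad \mathbf{where} & & \\
\quad \mathit{red}\langle 1\rangle\ () & = & e \\
\quad \mathit{red}\langle \mathit{Int}\rangle\ i & = & e \\
\quad \mathit{red}\langle +\rangle\ \mathit{reda}\ \mathit{redb}\ (\mathit{Inl}\ v) & = & \mathit{reda}\ v \\
\quad \mathit{red}\langle +\rangle\ \mathit{reda}\ \mathit{redb}\ (\mathit{Inr}\ w) & = & \mathit{redb}\ w \\
\quad \mathit{red}\langle \times\rangle\ \mathit{reda}\ \mathit{redb}\ (v, w) & = & \mathit{reda}\ v \oplus \mathit{redb}\ w
\end{array}
\]

Note that we can define red even more succinctly using a point-free style.

\[
\begin{array}{lcl}
\mathit{red}\langle 1\rangle & = & \mathit{const}\ e \\
\mathit{red}\langle \mathit{Int}\rangle & = & \mathit{const}\ e \\
\mathit{red}\langle +\rangle\ \mathit{reda}\ \mathit{redb} & = & \mathit{reda} \triangledown \mathit{redb} \\
\mathit{red}\langle \times\rangle\ \mathit{reda}\ \mathit{redb} & = & \mathit{uncurry}\ (\oplus) \cdot (\mathit{reda} \times \mathit{redb})
\end{array}
\]

Here, $(\triangledown)$ is the so-called junction operator [2]. The type of $\mathit{reduce}\langle f\rangle$ where $f$ is
a unary type constructor is quite general.

\[
\mathit{reduce}\langle f :: \star \to \star\rangle :: \forall z.z \to (z \to z \to z) \to (\forall x.(x \to z) \to (f\ x \to z))
\]

Fig. 3 lists some typical applications of $\mathit{reduce}\langle f\rangle$ and $\mathit{reduce}\langle g\rangle$ where $g$ is a
binary type constructor.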
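To make the point-free equations concrete, the following sketch hand-specialises reduce to built-in lists, viewed as $1 + a \times \mathit{List}\ a$. All names (`redUnit`, `redSum`, `redProd`, `unroll`, `reduceList`) are our own, and the list instance is assembled manually rather than generated by a specialiser, so this illustrates the pattern and is not the paper's mechanism.

```haskell
-- red<1> = const e
redUnit :: z -> () -> z
redUnit e () = e

-- red<+> reda redb = junction of reda and redb
redSum :: (a -> z) -> (b -> z) -> Either a b -> z
redSum reda _    (Left v)  = reda v
redSum _    redb (Right w) = redb w

-- red<x> reda redb (v, w) = reda v `op` redb w
redProd :: (z -> z -> z) -> (a -> z) -> (b -> z) -> (a, b) -> z
redProd op reda redb (v, w) = reda v `op` redb w

-- Structure view of lists: List a ~ 1 + a x List a.
unroll :: [a] -> Either () (a, [a])
unroll []       = Left ()
unroll (x : xs) = Right (x, xs)

-- reduce<List> e op reda, built from the instances above.
reduceList :: z -> (z -> z -> z) -> (a -> z) -> [a] -> z
reduceList e op reda = go
  where
    go = redSum (redUnit e) (redProd op reda go) . unroll

-- Some of the reductions of Fig. 3, specialised to lists.
fsum :: [Int] -> Int
fsum = reduceList 0 (+) id

fsize :: [a] -> Int
fsize = reduceList 0 (+) (const 1)

fand :: [Bool] -> Bool
fand = reduceList True (&&) id

fflatten :: [a] -> [a]
fflatten = reduceList [] (++) (\v -> [v])
```

The only list-specific ingredient is `unroll`; everything else is shared between the four reductions, which is the abstraction that reduce is meant to capture.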

7 Properties of Polytypic Values 

This section investigates another important aspect of polytypism: polytypic rea- 
soning. The section is structured as follows. Section 7.1 shows how to generalize 
the functorial laws to datatypes of arbitrary kinds and explains how to prove 
these generalizations correct. In Section 7.2 we develop a general framework for 
conducting polytypic proofs. This framework is then illustrated in Section 7.3 
by several examples. 

7.1 Functorial Laws

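To classify as a functor the mapping function of a unary type constructor must
satisfy the so-called functorial laws:

\[
\begin{array}{lcl}
\mathit{map}\langle f\rangle\ \mathit{id} & = & \mathit{id} \\
\mathit{map}\langle f\rangle\ (\varphi \cdot \psi) & = & \mathit{map}\langle f\rangle\ \varphi \cdot \mathit{map}\langle f\rangle\ \psi\ ,
\end{array}
\]

that is, $\mathit{map}\langle f\rangle$ preserves identity and composition. If the type constructor is
binary, the functor laws take the form

\[
\begin{array}{lcl}
\mathit{map}\langle g\rangle\ \mathit{id}\ \mathit{id} & = & \mathit{id} \\
\mathit{map}\langle g\rangle\ (\varphi_1 \cdot \psi_1)\ (\varphi_2 \cdot \psi_2) & = & \mathit{map}\langle g\rangle\ \varphi_1\ \varphi_2 \cdot \mathit{map}\langle g\rangle\ \psi_1\ \psi_2\ .
\end{array}
\]

How can we generalize these laws to datatypes of arbitrary kinds? Since
$\mathit{map}\langle t\rangle$ has a kind-indexed type, it is reasonable to expect that the functorial
properties are indexed by kinds, as well. So, what form do the laws take if the
type index is a manifest type, say, $t$? In this case $\mathit{map}\langle t\rangle$ does not preserve
identity; it is the identity.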



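\[
\begin{array}{lcl}
\mathit{map}\langle t\rangle & = & \mathit{id} \\
\mathit{map}\langle t\rangle & = & \mathit{map}\langle t\rangle \cdot \mathit{map}\langle t\rangle
\end{array}
\]

The counterpart of the second law states that $\mathit{map}\langle t\rangle$ is idempotent (which is a
simple consequence of the first law). Given this base case the generalization to
arbitrary kinds is within reach. The polytypic version of the first functorial law
states that $\mathit{map}\langle t\rangle \in \mathcal{I}\langle\kappa\rangle$ for all types $t \in T^{\circ}(\kappa)$, where $\mathcal{I}\langle\kappa\rangle$ is given by

\[
\begin{array}{lcl}
\varphi \in \mathcal{I}\langle\star\rangle & \iff & \varphi = \mathit{id} \\
\varphi \in \mathcal{I}\langle\kappa_1 \to \kappa_2\rangle & \iff & \forall v.\ v \in \mathcal{I}\langle\kappa_1\rangle \supset \varphi\ v \in \mathcal{I}\langle\kappa_2\rangle\ .
\end{array}
\]

The relation $\mathcal{I}$ strongly resembles a so-called unary logical relation [22]. The
second clause of the definition is characteristic of logical relations; it guarantees
that the relation is closed under type application and type abstraction. We will
call $\mathcal{I}$ and its colleagues polytypic logical relations (or simply logical relations)
for want of a better name. Section 7.2 details the differences between polytypic
and ‘classical’ logical relations.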

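Similarly, the polytypic version of the second functorial law expresses that
$(\mathit{map}\langle t\rangle, \mathit{map}\langle t\rangle, \mathit{map}\langle t\rangle) \in \mathcal{C}\langle\kappa\rangle$ for all types $t \in T^{\circ}(\kappa)$, where $\mathcal{C}\langle\kappa\rangle$ is given by

\[
\begin{array}{lcl}
(\varphi_1, \varphi_2, \varphi_3) \in \mathcal{C}\langle\star\rangle & \iff & \varphi_1 \cdot \varphi_2 = \varphi_3 \\
(\varphi_1, \varphi_2, \varphi_3) \in \mathcal{C}\langle\kappa_1 \to \kappa_2\rangle & \iff & \forall v_1\ v_2\ v_3.\ (v_1, v_2, v_3) \in \mathcal{C}\langle\kappa_1\rangle \\
 & & \quad \supset (\varphi_1\ v_1, \varphi_2\ v_2, \varphi_3\ v_3) \in \mathcal{C}\langle\kappa_2\rangle\ .
\end{array}
\]

Note again the characteristic form of the second clause. It expresses that functions
are related if related arguments are taken to related results. It is not hard
to see that the monotypic functorial laws are instances of these polytypic laws.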
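For a unary type constructor $f$ the instantiation can be spelled out by unfolding both relations at kind $\star \to \star$. The following is a worked restatement that uses only the two definitions above.

```latex
\begin{align*}
\mathit{map}\langle f\rangle \in \mathcal{I}\langle\star \to \star\rangle
  &\iff \forall v.\ v \in \mathcal{I}\langle\star\rangle \supset \mathit{map}\langle f\rangle\ v \in \mathcal{I}\langle\star\rangle \\
  &\iff \forall v.\ v = \mathit{id} \supset \mathit{map}\langle f\rangle\ v = \mathit{id} \\
  &\iff \mathit{map}\langle f\rangle\ \mathit{id} = \mathit{id} \\[6pt]
(\mathit{map}\langle f\rangle, \mathit{map}\langle f\rangle, \mathit{map}\langle f\rangle) \in \mathcal{C}\langle\star \to \star\rangle
  &\iff \forall v_1\ v_2\ v_3.\ v_1 \cdot v_2 = v_3
        \supset \mathit{map}\langle f\rangle\ v_1 \cdot \mathit{map}\langle f\rangle\ v_2 = \mathit{map}\langle f\rangle\ v_3 \\
  &\iff \forall v_1\ v_2.\ \mathit{map}\langle f\rangle\ v_1 \cdot \mathit{map}\langle f\rangle\ v_2 = \mathit{map}\langle f\rangle\ (v_1 \cdot v_2)
\end{align*}
```

The last line of each chain is the corresponding monotypic law for $f$.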

Turning to the proof of the polytypic laws we must show (1) that $\mathit{map}\langle c\rangle \in \mathcal{I}\langle\mathit{kind}\ c\rangle$
and $(\mathit{map}\langle c\rangle, \mathit{map}\langle c\rangle, \mathit{map}\langle c\rangle) \in \mathcal{C}\langle\mathit{kind}\ c\rangle$ for all type constants $c \in C$
and (2) that $\mathit{fix} \in \mathcal{I}\langle\kappa\rangle$ and $(\mathit{fix}, \mathit{fix}, \mathit{fix}) \in \mathcal{C}\langle\kappa\rangle$ for all $\kappa \in K$. The first part
follows directly from the functorial laws of $(+)$ and $(\times)$. The second part is
more delicate. The usual approach is to impose two further conditions on logical
relations: they must be pointed and chain-complete^ (we tacitly assume that
we are working in a domain-theoretical setting). Then one can invoke fixpoint
induction to show that fixpoint operators are logically related. Now, $\mathcal{C}$ satisfies
these requirements; $\mathcal{I}$, however, does not since it fails to be pointed (clearly,
$\bot = \mathit{id}$ does not hold in general).

Let us take a closer look at the reason for this failure. To this end consider the
type $\mathit{Tree} = \mu\,\mathit{TreeF}$ with $\mathit{TreeF} = \Lambda a.1 + a \times a$. The associated mapping function
is $\mathit{mapTree} = \mathit{fix}\ \mathit{mapTreeF}$ with $\mathit{mapTreeF} = \lambda \mathit{mapa}.\mathit{id} + \mathit{mapa} \times \mathit{mapa}$.
Now, it is fairly obvious that we cannot prove $\mathit{mapTree} = \mathit{id}$ with the machinery
developed so far. In a sense, the problem is that equality is only ‘weakly typed’:
we must prove $\varphi\ v = v$ for all $v :: \mathit{Tree}$. Clearly, the approximations of mapTree,
$\mathit{mapTreeF}^i\ \bot$, do not satisfy this strong condition. However, $\mathit{mapTreeF}^i\ \bot\ v = v$
holds for all elements $v :: \mathit{TreeF}^i\ 0$. Here, $0$ is the ‘empty’ type, which contains
$\bot$ as the single element. In other words, approximations on the value
level and approximations on the type level should go hand in hand. This can
be accomplished by parameterizing logical relations with the types involved:
$\varphi \in \mathcal{I}\langle\star\rangle\ t \iff \varphi = \mathit{id} :: t \to t$. Given this modification it is not hard to show
that $\mathcal{I}\langle\star\rangle$ is pointed, that is, $\bot \in \mathcal{I}\langle\star\rangle\ 0$. We can then invoke a polytypic version
of Scott and de Bakker’s fixpoint induction to conclude that $\mathit{fix} \in \mathcal{I}\langle\kappa\rangle\ \mu$ for
all $\kappa \in K$. Polytypic fixpoint induction can be stated as the following inference
rule:



\[
\frac{\bot \in P(0) \qquad \forall x.\forall v.\ v \in P(x) \supset \varphi\ v \in P(t\ x)}{\mathit{fix}\ \varphi \in P(\mu t)}\ (\text{fp-induction})
\]

where $P$ is a type-indexed chain-complete predicate. Here, chain-completeness
means that whenever $\varphi^i\ \bot \in P(t^i\ 0)$ for all $i \in \mathbb{N}$ then $\mathit{fix}\ \varphi \in P(\mu t)$.

^ A relation $\mathcal{R}$ is pointed if $\bot \in \mathcal{R}$; it is chain-complete if $S \subseteq \mathcal{R} \supset \bigsqcup S \in \mathcal{R}$ for every
chain $S$.
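The Tree example and its approximations can be played through in Haskell, where `undefined` stands in for $\bot$. This is a sketch under our own encoding: the newtype `Tree` with constructor `In` ties the recursive knot for $\mu\,\mathit{TreeF}$, and `approx` is a hypothetical helper computing $\mathit{mapTreeF}^i\ \bot$.

```haskell
import Data.Function (fix)

-- Tree = mu TreeF with TreeF = \a -> 1 + a x a.
newtype Tree = In (Either () (Tree, Tree))
  deriving (Eq, Show)

leaf :: Tree
leaf = In (Left ())

node :: Tree -> Tree -> Tree
node l r = In (Right (l, r))

-- mapTreeF = \mapa -> id + mapa x mapa
mapTreeF :: (Tree -> Tree) -> (Tree -> Tree)
mapTreeF _    (In (Left ()))      = In (Left ())
mapTreeF mapa (In (Right (l, r))) = In (Right (mapa l, mapa r))

-- mapTree = fix mapTreeF
mapTree :: Tree -> Tree
mapTree = fix mapTreeF

-- The i-th approximation mapTreeF^i bottom.
approx :: Int -> (Tree -> Tree)
approx i = iterate mapTreeF undefined !! i
```

The approximation `approx i` behaves as the identity only on trees that are shallow enough, for instance `approx 1` on `leaf` and `approx 2` on `node leaf leaf`; on deeper trees it hits `undefined`. This matches the observation that value-level and type-level approximations have to go hand in hand.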

7.2 Polytypic Logical Relations 

Logical relations were originally developed for the study of typed lambda calculi. 
The interested reader is referred to J. Mitchell’s textbook [22] for a comprehen- 
sive treatment of logical relations. The basic idea is as follows. Say, we are given 
two models of the simply typed lambda calculus. Then the Basic Lemma of 
logical relations establishes that the meaning of a term in one model is logi- 
cally related to its meaning in the other model (provided the meanings of the 
constants are logically related). 

We have discussed in Section 5 that the specialization of a polytypic value 
can be seen as an interpretation of the simply typed lambda calculus. Actually, 
the interpretation is a two-stage process: the specialization maps a type term to a 
value term, which is then interpreted in some fixed domain-theoretic model. Note 
that we require domain-theoretic models in order to cope with (type) recursion. 
The only additional requirement is that the model must satisfy the polytypic 
fixpoint induction rule. 

Models based on universal domains such as $P\omega$ are suitable for this purpose,
see [1,22]. These models allow types to be interpreted as certain elements (closures
or finitary projections) of the universal domain, so that type recursion can be
interpreted by the (untyped) least fixpoint operator. Then the polytypic
fixpoint induction rule is just a two-argument version of the ordinary fixpoint
induction rule.

There are basically two differences from the ‘classical’ notion of logical relation.
(1) We do not relate elements in two different models but different elements
(obtained via the specialization) in the same model, that is, for some fixed model the
meaning of $\mathit{poly}_1\langle t\rangle$ is logically related to the meaning of $\mathit{poly}_2\langle t\rangle$. (2) The type
of $\mathit{poly}_1\langle t\rangle$ and $\mathit{poly}_2\langle t\rangle$ and consequently the type of their meanings depends on
the type index $t$. For that reason polytypic logical relations are parameterized by
types (respectively, by the meaning of types). We have motivated this extension
at some length in the previous section.

Now, for presenting the Basic Lemma we will use a ‘semantic version’ of poly,
where the meaning of $\mathit{poly}\langle c\rangle$ is an element in some domain.

\[
\begin{array}{lcl}
\mathit{poly}\langle x\rangle\ \eta & = & \eta\ x \\
\mathit{poly}\langle t_1\ t_2\rangle\ \eta & = & (\mathit{poly}\langle t_1\rangle\ \eta)\ (\mathit{poly}\langle t_2\rangle\ \eta) \\
\mathit{poly}\langle \Lambda x.t\rangle\ \eta & = & \lambda v.\mathit{poly}\langle t\rangle\ (\eta(x := v)) \\
\mathit{poly}\langle \mu_\kappa\rangle\ \eta & = & \mathit{fix}
\end{array}
\]

Here, $\eta$ is an environment that maps type variables to values, and $\eta(x := v)$
is syntax for extending the environment $\eta$ by the binding $x \mapsto v$. It is understood
that the constructions on the right-hand side denote semantic values,
for instance, $\lambda v.\mathit{poly}\langle t\rangle\ (\eta(x := v))$ denotes a function mapping $v$ to
$\mathit{poly}\langle t\rangle\ (\eta(x := v))$. For closed type terms we write $\mathit{poly}\langle t\rangle$ instead of $\mathit{poly}\langle t\rangle\ \eta$.

In presenting logical relations we will restrict ourselves to the binary case.
The extension to the n-ary case is, however, entirely straightforward.

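Definition 7. A binary logical relation $\mathcal{R}$ over $\mathcal{D}_1$ and $\mathcal{D}_2$, where $\mathcal{D}_1$ and $\mathcal{D}_2$
are kind-indexed types, is a kind-indexed relation such that

1. $\mathcal{R}\langle\kappa\rangle\ t_1 \ldots t_n \subseteq \mathcal{D}_1\langle\kappa\rangle\ t_1 \ldots t_n \times \mathcal{D}_2\langle\kappa\rangle\ t_1 \ldots t_n$,

2. $(\varphi_1, \varphi_2) \in \mathcal{R}\langle\kappa_1 \to \kappa_2\rangle\ t_1 \ldots t_n$
   $\iff \forall x_1 \ldots x_n.\forall v_1\ v_2.\ (v_1, v_2) \in \mathcal{R}\langle\kappa_1\rangle\ x_1 \ldots x_n$
   $\supset (\varphi_1\ v_1, \varphi_2\ v_2) \in \mathcal{R}\langle\kappa_2\rangle\ (t_1\ x_1) \ldots (t_n\ x_n)$,

3. $\mathcal{R}$ is pointed,

4. $\mathcal{R}$ is chain-complete.

Usually, a logical relation will be defined for the base case $\star$; the second clause
of the definition then shows how to extend the relation to functional kinds.
The third and the fourth condition ensure that a logical relation relates fixpoint
operators. It is generally easy to prove that a relation is pointed: note that
$\mathcal{R}\langle\kappa_1 \to \kappa_2\rangle$ is pointed if $\mathcal{R}\langle\kappa_2\rangle$ is pointed; hence it suffices to show that $\mathcal{R}\langle\star\rangle$
is pointed. Similarly, $\mathcal{R}\langle\kappa_1 \to \kappa_2\rangle$ is chain-complete if $\mathcal{R}\langle\kappa_1\rangle$ and $\mathcal{R}\langle\kappa_2\rangle$ are
chain-complete; $\mathcal{R}\langle\star\rangle$ is chain-complete if it takes the form of an equation.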

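Lemma 1 (Basic Lemma). Let $\mathcal{R}$ be a logical relation over $\mathit{Poly}_1$ and $\mathit{Poly}_2$
and let $\mathit{poly}_1\langle t\rangle :: \mathit{Poly}_1\langle\kappa\rangle\ t \ldots t$ and $\mathit{poly}_2\langle t\rangle :: \mathit{Poly}_2\langle\kappa\rangle\ t \ldots t$ be such that

\[
(\mathit{poly}_1\langle c\rangle, \mathit{poly}_2\langle c\rangle) \in \mathcal{R}\langle\mathit{kind}\ c\rangle\ c \ldots c
\]

for every type constant $c \in C$. If $\eta_1$ and $\eta_2$ are environments and $\theta_1, \ldots, \theta_n$ are
type substitutions such that $(\eta_1\ x, \eta_2\ x) \in \mathcal{R}\langle\mathit{kind}\ x\rangle\ (x\theta_1) \ldots (x\theta_n)$ for every
type variable $x \in \mathit{free}(t)$, then

\[
(\mathit{poly}_1\langle t\rangle\ \eta_1, \mathit{poly}_2\langle t\rangle\ \eta_2) \in \mathcal{R}\langle\kappa\rangle\ (t\theta_1) \ldots (t\theta_n)
\]

for every type term $t \in T^{\circ}(\kappa)$.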

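Proof. We use induction on the derivation of $u :: \kappa$.

Case $u = c$: the statement holds trivially since $\mathcal{R}$ relates constants.

Case $u = x$: the statement holds since $\eta_1$ and $\eta_2$ are related.

Case $u = t_1\ t_2$: by the induction hypothesis, we have

\[
\begin{array}{l}
(\mathit{poly}_1\langle t_1\rangle\ \eta_1, \mathit{poly}_2\langle t_1\rangle\ \eta_2) \in \mathcal{R}\langle\kappa_1 \to \kappa_2\rangle\ (t_1\theta_1) \ldots (t_1\theta_n) \\
\quad \iff \forall x_1 \ldots x_n.\forall v_1\ v_2.\ (v_1, v_2) \in \mathcal{R}\langle\kappa_1\rangle\ x_1 \ldots x_n \\
\qquad \supset ((\mathit{poly}_1\langle t_1\rangle\ \eta_1)\ v_1, (\mathit{poly}_2\langle t_1\rangle\ \eta_2)\ v_2) \in \mathcal{R}\langle\kappa_2\rangle\ ((t_1\theta_1)\ x_1) \ldots ((t_1\theta_n)\ x_n)
\end{array}
\]

and

\[
(\mathit{poly}_1\langle t_2\rangle\ \eta_1, \mathit{poly}_2\langle t_2\rangle\ \eta_2) \in \mathcal{R}\langle\kappa_1\rangle\ (t_2\theta_1) \ldots (t_2\theta_n)\ ,
\]

which implies $(\mathit{poly}_1\langle t_1\ t_2\rangle\ \eta_1, \mathit{poly}_2\langle t_1\ t_2\rangle\ \eta_2) \in \mathcal{R}\langle\kappa_2\rangle\ ((t_1\ t_2)\theta_1) \ldots ((t_1\ t_2)\theta_n)$.

Case $u = \Lambda x.t$: we have to show that

\[
\begin{array}{l}
(\mathit{poly}_1\langle \Lambda x.t\rangle\ \eta_1, \mathit{poly}_2\langle \Lambda x.t\rangle\ \eta_2) \in \mathcal{R}\langle\mathit{kind}\ x \to \kappa\rangle\ ((\Lambda x.t)\theta_1) \ldots ((\Lambda x.t)\theta_n) \\
\quad \iff \forall x_1 \ldots x_n.\forall v_1\ v_2.\ (v_1, v_2) \in \mathcal{R}\langle\mathit{kind}\ x\rangle\ x_1 \ldots x_n \\
\qquad \supset ((\mathit{poly}_1\langle \Lambda x.t\rangle\ \eta_1)\ v_1, (\mathit{poly}_2\langle \Lambda x.t\rangle\ \eta_2)\ v_2) \\
\qquad\quad \in \mathcal{R}\langle\kappa\rangle\ (((\Lambda x.t)\theta_1)\ x_1) \ldots (((\Lambda x.t)\theta_n)\ x_n)\ .
\end{array}
\]

Assume that $(v_1, v_2) \in \mathcal{R}\langle\mathit{kind}\ x\rangle\ x_1 \ldots x_n$. Since the modified environments
$\eta_1(x := v_1)$ and $\eta_2(x := v_2)$ are related, we can invoke the induction hypothesis
to obtain

\[
\begin{array}{l}
(\mathit{poly}_1\langle t\rangle\ (\eta_1(x := v_1)), \mathit{poly}_2\langle t\rangle\ (\eta_2(x := v_2))) \\
\quad \in \mathcal{R}\langle\kappa\rangle\ (t(\theta_1(x := x_1))) \ldots (t(\theta_n(x := x_n)))\ .
\end{array}
\]

Now, since $((\Lambda x.t)\theta_i)\ x_i = t(\theta_i(x := x_i))$, the statement follows.

Case $u = \mu_\kappa$: we have to show that

\[
\begin{array}{l}
(\mathit{fix}, \mathit{fix}) \in \mathcal{R}\langle(\kappa \to \kappa) \to \kappa\rangle\ \mu \ldots \mu \\
\quad \iff \forall x_1 \ldots x_n.\forall \varphi_1\ \varphi_2.\ (\varphi_1, \varphi_2) \in \mathcal{R}\langle\kappa \to \kappa\rangle\ x_1 \ldots x_n \\
\qquad \supset (\mathit{fix}\ \varphi_1, \mathit{fix}\ \varphi_2) \in \mathcal{R}\langle\kappa\rangle\ (\mu x_1) \ldots (\mu x_n)\ .
\end{array}
\]

Assume that $\varphi_1$ and $\varphi_2$ are related. By definition of logical relations, we have

\[
\begin{array}{l}
(\varphi_1, \varphi_2) \in \mathcal{R}\langle\kappa \to \kappa\rangle\ x_1 \ldots x_n \\
\quad \iff \forall y_1 \ldots y_n.\forall v_1\ v_2.\ (v_1, v_2) \in \mathcal{R}\langle\kappa\rangle\ y_1 \ldots y_n \\
\qquad \supset (\varphi_1\ v_1, \varphi_2\ v_2) \in \mathcal{R}\langle\kappa\rangle\ (x_1\ y_1) \ldots (x_n\ y_n)\ .
\end{array}
\]

Since $\mathcal{R}$ is pointed, we furthermore have $(\bot, \bot) \in \mathcal{R}\langle\kappa\rangle\ 0 \ldots 0$. The polytypic
fixpoint induction rule then implies $(\mathit{fix}\ \varphi_1, \mathit{fix}\ \varphi_2) \in \mathcal{R}\langle\kappa\rangle\ (\mu x_1) \ldots (\mu x_n)$. □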



7.3 Examples 

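Mapping Functions The functorial laws are captured by the logical relations
$\mathcal{I}$ and $\mathcal{C}$, which we have discussed at length in Section 7.1.

\[
\begin{array}{lcl}
\mathcal{I}\langle\star\rangle\ t & \subseteq & t \to t \\
\varphi \in \mathcal{I}\langle\star\rangle\ t & \iff & \varphi = \mathit{id} :: t \to t \\
\mathcal{C}\langle\star\rangle\ t_1\ t_2\ t_3 & \subseteq & (t_2 \to t_3) \times (t_1 \to t_2) \times (t_1 \to t_3) \\
(\varphi_1, \varphi_2, \varphi_3) \in \mathcal{C}\langle\star\rangle\ t_1\ t_2\ t_3 & \iff & \varphi_1 \cdot \varphi_2 = \varphi_3 :: t_1 \to t_3
\end{array}
\]

It is not hard to see that both $\mathcal{I}$ and $\mathcal{C}$ are pointed and chain-complete. Since
$\mathit{map}\langle c\rangle \in \mathcal{I}\langle\mathit{kind}\ c\rangle\ c$ and $(\mathit{map}\langle c\rangle, \mathit{map}\langle c\rangle, \mathit{map}\langle c\rangle) \in \mathcal{C}\langle\mathit{kind}\ c\rangle\ c\ c\ c$ for
all type constants $c \in C$, the Basic Lemma implies $\mathit{map}\langle t\rangle \in \mathcal{I}\langle\kappa\rangle\ t$ and
$(\mathit{map}\langle t\rangle, \mathit{map}\langle t\rangle, \mathit{map}\langle t\rangle) \in \mathcal{C}\langle\kappa\rangle\ t\ t\ t$ for all closed type terms $t \in T^{\circ}(\kappa)$.

Reductions Using a minor variant of $\mathcal{C}$ we can also relate reductions and mapping
functions.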



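\[
\begin{array}{lcl}
\mathcal{C}_z\langle\star\rangle\ t_1\ t_2 & \subseteq & (t_2 \to z) \times (t_1 \to t_2) \times (t_1 \to z) \\
(\varphi_1, \varphi_2, \varphi_3) \in \mathcal{C}_z\langle\star\rangle\ t_1\ t_2 & \iff & \varphi_1 \cdot \varphi_2 = \varphi_3 :: t_1 \to z
\end{array}
\]

Here, $z$ is some fixed type that is passed unchanged to the base case. Now, given
an element $e :: z$ and an operation $(\oplus) :: z \to z \to z$, we have

\[
(\mathit{reduce}\langle t\rangle\ e\ (\oplus), \mathit{map}\langle t\rangle, \mathit{reduce}\langle t\rangle\ e\ (\oplus)) \in \mathcal{C}_z\langle\kappa\rangle\ t\ t\ .
\]

An immediate consequence of this property is

\[
\mathit{reduce}\langle f\rangle\ e\ (\oplus)\ \varphi \cdot \mathit{map}\langle f\rangle\ \psi = \mathit{reduce}\langle f\rangle\ e\ (\oplus)\ (\varphi \cdot \psi)\ ,
\]

which shows how to fuse a reduction with a map. Now, in order to prove the
polytypic property we merely have to verify that the statement holds for every
type constant $c \in C$. Using the point-free definitions of map and red this amounts
to showing that

\[
\begin{array}{lcl}
\mathit{const}\ e \cdot \mathit{id} & = & \mathit{const}\ e \\
(\varphi_1 \triangledown \varphi_2) \cdot (\psi_1 + \psi_2) & = & (\varphi_1 \cdot \psi_1) \triangledown (\varphi_2 \cdot \psi_2) \\
\mathit{uncurry}\ (\oplus) \cdot (\varphi_1 \times \varphi_2) \cdot (\psi_1 \times \psi_2) & = & \mathit{uncurry}\ (\oplus) \cdot ((\varphi_1 \cdot \psi_1) \times (\varphi_2 \cdot \psi_2))\ .
\end{array}
\]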

All three conditions hold. 
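The second and third conditions can be spot-checked in Haskell on sample functions. The sketch below is not a proof; it merely evaluates both sides at chosen arguments. The names `plus`, `times` and `junc` are our own renderings of $+$, $\times$ and $\triangledown$ on functions, and the sample functions are arbitrary.

```haskell
-- f + g: apply f or g underneath a sum.
plus :: (a -> c) -> (b -> d) -> Either a b -> Either c d
plus f g = either (Left . f) (Right . g)

-- f x g: apply f and g componentwise to a pair.
times :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
times f g (v, w) = (f v, g w)

-- Junction: case analysis on a sum.
junc :: (a -> z) -> (b -> z) -> Either a b -> z
junc = either

-- Both sides of the condition for sums, with
-- phi1 = (* 2), phi2 = negate, psi1 = length, psi2 = sum.
sumLhs, sumRhs :: Either String [Int] -> Int
sumLhs = junc (* 2) negate . plus length sum
sumRhs = junc ((* 2) . length) (negate . sum)

-- Both sides of the condition for products, with (+) as the operator.
prodLhs, prodRhs :: (String, [Int]) -> Int
prodLhs = uncurry (+) . times (* 2) negate . times length sum
prodRhs = uncurry (+) . times ((* 2) . length) (negate . sum)
```

Evaluating `sumLhs` and `sumRhs` on `Left "abc"` and on `Right [1, 2]`, or `prodLhs` and `prodRhs` on `("ab", [1, 2, 3])`, gives matching results in each case.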

Previous approaches to polytypic programming [11,7] required the programmer
to specify the action of a polytypic function for the composition of two type
constructors: for instance, for fsize the polytypic programmer had to supply the
equation $\mathit{fsize}\langle f_1 \cdot f_2\rangle = \mathit{fsum}\langle f_1\rangle \cdot \mathit{map}\langle f_1\rangle\ (\mathit{fsize}\langle f_2\rangle)$. Interestingly, using
reduce-map fusion this equation can be derived from the definitions of fsize and fsum
given in Fig. 3.



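\[
\begin{array}{cl}
 & \mathit{fsize}\langle f_1 \cdot f_2\rangle \\
= & \{\ \text{definition } \mathit{fsize}\ \} \\
 & \mathit{reduce}\langle f_1 \cdot f_2\rangle\ 0\ (+)\ (\mathit{const}\ 1) \\
= & \{\ \mathit{poly}\langle f_1 \cdot f_2\rangle = \mathit{poly}\langle f_1\rangle \cdot \mathit{poly}\langle f_2\rangle\ \} \\
 & \mathit{reduce}\langle f_1\rangle\ 0\ (+)\ (\mathit{reduce}\langle f_2\rangle\ 0\ (+)\ (\mathit{const}\ 1)) \\
= & \{\ \text{definition } \mathit{fsize}\ \} \\
 & \mathit{reduce}\langle f_1\rangle\ 0\ (+)\ (\mathit{fsize}\langle f_2\rangle) \\
= & \{\ \text{reduce-map fusion}\ \} \\
 & \mathit{reduce}\langle f_1\rangle\ 0\ (+)\ \mathit{id} \cdot \mathit{map}\langle f_1\rangle\ (\mathit{fsize}\langle f_2\rangle) \\
= & \{\ \text{definition } \mathit{fsum}\ \} \\
 & \mathit{fsum}\langle f_1\rangle \cdot \mathit{map}\langle f_1\rangle\ (\mathit{fsize}\langle f_2\rangle)
\end{array}
\]

As a final example let us generalize the fusion law for reductions given by
L. Meertens in [18]. To this end we use the logical relation $\mathcal{F}$ defined by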

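\[
\begin{array}{lcl}
\mathcal{F}_{z_1,z_2}\langle\star\rangle\ t & \subseteq & (t \to z_1) \times (t \to z_2) \\
(\varphi_1, \varphi_2) \in \mathcal{F}_{z_1,z_2}\langle\star\rangle\ t & \iff & h \cdot \varphi_1 = \varphi_2 :: t \to z_2\ ,
\end{array}
\]

where $z_1$ and $z_2$ are fixed types and $h :: z_1 \to z_2$ is a fixed function. The polytypic
fusion law, which gives conditions for fusing the function $h$ with a reduction,
then takes the following form (to ensure that $\mathcal{F}$ is pointed $h$ must be strict)

\[
\begin{array}{ll}
 & h\ \bot = \bot \\
\wedge & h\ n = e \\
\wedge & h\ (v \oplus w) = h\ v \otimes h\ w \\
\supset & (\mathit{reduce}\langle t\rangle\ n\ (\oplus), \mathit{reduce}\langle t\rangle\ e\ (\otimes)) \in \mathcal{F}_{z_1,z_2}\langle\kappa\rangle\ t\ .
\end{array}
\]

We can apply this law, for instance, to prove that $\mathit{length} \cdot \mathit{fflatten}\langle f\rangle = \mathit{fsize}\langle f\rangle$.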
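For this application the premises are instantiated with $h = \mathit{length}$, $n = \mathit{Nil}$, $e = 0$, $(\oplus) = (\mathbin{+\!\!+})$ and $(\otimes) = (+)$. The Haskell sketch below checks the resulting equation on one sample value of a binary tree functor; the type `Bin` and the hand-written `reduceBin` are assumptions for illustration, chosen because flattening a plain list would be trivial.

```haskell
-- A sample unary type constructor: Bin a ~ 1 + Bin a x (a x Bin a).
data Bin a = Tip | Fork (Bin a) a (Bin a)

-- reduce<Bin>, written by hand following the sum-of-products structure.
reduceBin :: z -> (z -> z -> z) -> (a -> z) -> Bin a -> z
reduceBin e _  _    Tip          = e
reduceBin e op reda (Fork l x r) =
  reduceBin e op reda l `op` (reda x `op` reduceBin e op reda r)

fflatten :: Bin a -> [a]
fflatten = reduceBin [] (++) (\v -> [v])

fsize :: Bin a -> Int
fsize = reduceBin 0 (+) (const 1)

-- Premises of the fusion law for h = length:
--   length []        = 0
--   length (v ++ w)  = length v + length w
-- and length is strict. Hence length . fflatten = fsize.
sample :: Bin Char
sample = Fork (Fork Tip 'a' Tip) 'b' (Fork Tip 'c' Tip)
```

On `sample`, both `length (fflatten sample)` and `fsize sample` count the three labels; the fusion law states that the two functions agree on every argument.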



8 Related Work 

The idea to assign polykinded types to polytypic values is, to the best of the
author’s knowledge, original. Previous approaches to polytypic programming [11,8]
were restricted in that they only allowed values to be parameterized by types of one
fixed kind. Three notable exceptions are Functorial ML (FML) [14], the work
of F. Ruehr [27], and the work of P. Hoogendijk and R. Backhouse [10]. FML
allows quantification over functor arities in type schemes (since FML handles only
regular, first-order functors, kinds can be simplified to arities). However, no
formal account of this feature is given and the informal description makes use of
an infinitary typing rule. Furthermore, the polytypic definitions based on this
extension are rather unwieldy from a notational point of view. F. Ruehr also re- 
stricts type indices to types of one fixed kind. Additional flexibility is, however, 
gained through the use of a more expressive kind language, which incorporates 
kind variables. This extension is used to define a higher-order map indexed by 
types of kind $(a \to \star) \to \star$, where $a$ is a kind variable. Clearly, this mapping
function is subsumed by the polytypic map given in Section 2. Whether kind 
polymorphism has other benefits remains to be seen. Finally, definitions of poly- 
typic values that are indexed by relators of different arities can be found in the 
work of P. Hoogendijk and R. Backhouse on commuting datatypes [10]. 

The results in this paper improve upon my earlier work on polytypic pro- 
gramming [8] in the following respects. As remarked above the previous work 
considered only polytypic values indexed by types of one fixed kind. Furthermore, 
the approach could only handle type indices of second-order kind or less and type 
constants (that is, primitive type constructors) were restricted to first-order kind 
or kind $\star$. Using polykinded types all these restrictions can be dropped.

9 Conclusion

Haskell possesses a rich type system, which essentially corresponds to the simply 
typed lambda calculus (with kinds playing the role of types) . This type system 
presents a challenge for polytypism: how can we define polytypic values and how 
can we assign types to these values? This paper offers satisfactory answers to 
both questions. It turns out that polytypic values possess polykinded types, that 
is, types that are defined by induction on the structure of kinds. Interestingly, to 
define a polykinded type it suffices to specify the image of the base kind; likewise, 
to define a polytypic value it suffices to specify the images of type constants. 
Everything else comes for free. In fact, the specialization of a polytypic value 
can be regarded as an interpretation of the simply typed lambda calculus. This 
renders it possible to adapt one of the main tools for studying typed lambda 
calculi, logical relations, to polytypic reasoning. To prove a polytypic property 
it suffices to prove the assertion for type constants. Everything else is taken care 
of automatically. We have applied this framework to show among other things 
that the polytypic map satisfies polytypic versions of the two functorial laws.

Abstract. Many have recognized the need for genericity in programming
and program transformation. Genericity over data types has been
achieved with polymorphism. Genericity over type constructors, often 
called polytypism, is an area of active research. This paper proposes that 
another kind of genericity is needed: genericity over the length of tuples. 
Untyped languages allow for such genericity but typed languages do not 
(except for languages allowing dependent types). The contribution of 
this paper is to present the “zip calculus,” a typed lambda calculus that 
provides genericity over the length of tuples and yet does not require the 
full generality of dependent types. 



1 Introduction 

The key to writing robust software is abstraction, but genericity is often needed 
to use abstraction: to write a generic sort routine, genericity over types is needed 
(i.e., polymorphism); to write a generic fold (or catamorphism, a function induc- 
tively defined over an inductive data structure), genericity over type constructors 
(e.g., List and Tree where List a and Tree a are types) is needed — this is often 
called polytypism. 

In program transformation the need for genericity is amplified. For example,
in a monomorphic language, one cannot write a polymorphic sort but must
write sortInt, sortFloat, etc. One will have laws about sortInt and
sortFloat instead of just one law about a generic sort; also one must transform
sortInt and sortFloat separately, even if the program derivation can be
“cut-and-pasted”. So, the ability to write a generic function, sort, reduces not only
program size, but also the number of laws and the length of program derivations.

Consequently, the program transformation community — notably the Bird- 
Meertens Formalism (or Squiggol) community [3,12,13] — has been working to 
make programs more generic: not just polymorphic, but polytypic [8,9,10,11]. 
However, the genericity provided by polymorphism and polytypism is still not 
adequate to achieve certain abstractions; another form of genericity is often 
needed — genericity over the length of tuples. This paper shows the usefulness of 
“n-tuples” (tuples whose lengths are unknown) and proposes a method to extend 
a programming language with n-tuples. 

* This research was supported in part by NSF under Grant Number CCR-9706747. 
Section 2 gives examples of the usefulness of n-tuples. Section 3 describes a
typed lambda calculus, “the zip calculus”, which gives genericity over n-tuples. 
Section 4 returns to the examples and shows what programs, laws, and program 
derivations look like using the zip calculus; other applications are also presented, 
including how to generalize catamorphisms to mutually recursive data types. 
Finally, section 5 discusses some limitations and compares this work to related 
work. 



2 Why Are N-Tuples Needed? 

An n-tuple is a tuple whose length is unknown. This section presents the useful- 
ness of n-tuples: just like polymorphism and polytypism, n-tuples result in more 
general programs (2.1), more general laws about those programs (2.2), and more 
general program derivations (2.3). 



2.1 More General Programs 

The following functions are defined in the Haskell [6] Prelude and Libraries

zip  :: ([a],[b]) -> [(a,b)]
zip3 :: ([a],[b],[c]) -> [(a,b,c)]
  ...
zip7 :: ([a],[b],[c],[d],[e],[f],[g]) -> [(a,b,c,d,e,f,g)]

which combine lists element-wise.¹ Also, there is the family of functions unzip,
unzip3, ... and the family of functions zipWith, zipWith3, .... To write the
zip3, ..., zip7 functions is not hard but tedious. It is clearly desirable to
abstract over these and write one generic zip, one generic zipWith, and one generic
unzip.
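The tedium can be seen in a minimal sketch of the uncurried family. The names zipU, zip3U, and zip4U are hypothetical (they are not library functions); they simply wrap the curried zip and zip3 from the Prelude and zip4 from Data.List.

```haskell
import Data.List (zip4)

-- Hypothetical uncurried wrappers; one definition is needed per tuple length.
zipU :: ([a], [b]) -> [(a, b)]
zipU (as, bs) = zip as bs

zip3U :: ([a], [b], [c]) -> [(a, b, c)]
zip3U (as, bs, cs) = zip3 as bs cs

zip4U :: ([a], [b], [c], [d]) -> [(a, b, c, d)]
zip4U (as, bs, cs, ds) = zip4 as bs cs ds
```

Each definition has the same shape, yet none can be obtained from the others inside the language; this is the repetition that a generic zip would remove.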



2.2 More General Laws 

Note the free theorem [18] for zip: 

map (cross (f,g)) o zip = zip o cross (map f, map g)
where
cross (f,g) (x,y) = (f x, g y)

(Note that “o” is used for function composition; “map f” applies f to each
element of a list.) The comparable theorem for zip3 is

map (cross3 (f,g,h)) o zip3 = zip3 o cross3 (map f, map g, map h)
where
cross3 (f,g,h) (x,y,z) = (f x, g y, h z)

¹ Actually, it is their curried counterparts that are defined, but the uncurried
versions are used here for illustrative purposes.
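The free theorem for zip can be spot-checked on concrete inputs. The sketch below is only a test harness, not a proof; zipU, lhs, and rhs are hypothetical names introduced for the illustration.

```haskell
-- cross applies a pair of functions component-wise, as defined in the text.
cross :: (a -> c, b -> d) -> (a, b) -> (c, d)
cross (f, g) (x, y) = (f x, g y)

-- Uncurried zip (hypothetical name), wrapping the Prelude's curried zip.
zipU :: ([a], [b]) -> [(a, b)]
zipU (as, bs) = zip as bs

-- The two sides of the law: map (cross (f,g)) o zip = zip o cross (map f, map g).
lhs, rhs :: (a -> c, b -> d) -> ([a], [b]) -> [(c, d)]
lhs (f, g) = map (cross (f, g)) . zipU
rhs (f, g) = zipU . cross (map f, map g)
```

Writing the same check for zip3 requires a separate cross3 and a separate pair of definitions, which is the duplication the text describes.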
To generate these laws is not hard but tedious and error-prone. To formulate
this family of laws yet another family of functions is needed: cross, cross3,
cross4, ... And note the following laws for 2-tuples and 3-tuples²

(fst x, snd x) = x

(fst3 x, snd3 x, thd3 x) = x

for which one needs another set of families of functions: fst, fst3, fst4, ...
and snd, snd3, snd4, .... One would wish to generalize over these families
of laws. Having fewer and more generic laws is very desirable in a program 
transformation system: one has fewer laws to learn, fewer laws to search, and 
more robust program derivations (i.e., program derivations are more likely to 
remain valid when applied to a modified input program). 
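The footnote's caveat about lifted tuples can be observed directly in Haskell. In the sketch below, rebuild and isDefined are hypothetical helper names; the point is only that rebuilding a pair from its projections always yields a pair constructor, even when the original value is undefined.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Rebuild a pair from its projections, as on the left of (fst x, snd x) = x.
rebuild :: (a, b) -> (a, b)
rebuild x = (fst x, snd x)

-- True when the argument can be forced to weak head normal form
-- without raising an exception.
isDefined :: a -> IO Bool
isDefined v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const False) (const True) r)
```

Here isDefined undefined yields False while isDefined (rebuild undefined) yields True, so the two sides of the law are distinguishable in Haskell; with unlifted tuples they would not be.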



2.3 More General Program Derivations 

It is common to have program derivations of the following form:

    fst e
  = e1                             Prove the case for the
                                   “fst” of the tuple.

  Similarly, snd e = e2            Wave hands.

  Thus,                            Make a conclusion about
  e = (fst e, snd e) = (e1,e2)     the tuple as a whole.

When arguing informally, this works well and of course scales easily to 3-tuples
and up. However, in a practical program transformation system this “similarly”
step must be done without “hand waving” and hopefully without duplicating
the derivation. One way to do this is to express the above law in some
meta-language or meta-logic where one could say something like $\forall n.\ \forall i \le n.\ P(\#i)$
(using ML syntax for projections where #1 = fst, #2 = snd).

However, a meta-language is now needed to express program laws. A simpler
approach to transformation, the schematic approach [7], avoids the use of a
meta-language: program laws are of the form “e1 = e2 ⇒ e3 = e4” (e1, e2, e3, e4 are
programs in the language, all free variables are implicitly universally quantified,
and the premise is optional); program derivations are developed by successively
applying program laws: the law is instantiated, the premise is satisfied, then the
conclusion is used to replace equals for equals. However, using this approach
for the above derivation requires one to generate nearly identical derivations for
the fst and snd cases. Is it possible to avoid this duplication of derivations?
Note that, in general, the form of (e1,e2) is (C[fst],C[snd])³ (or can be
transformed into such a form). So, one would like to merge the two similar
derivations

fst e = ... = C[fst]
snd e = ... = C[snd]

into a single derivation

#i e = ... = C[#i]

However, this still does not work because the “i” in #i must be a constant and
cannot be a variable or expression (in order for the program to be type-able).
But if “i” could be a variable, then simple equational reasoning can be used (as
in the schematic approach) without the need to add a meta-language. The zip
calculus allows one to do this.

² Ignoring the complication that these laws are not valid in Haskell, which has lifted
tuples; these same laws are valid in the Zip Calculus, which has unlifted tuples.

³ Where C[e] represents a program context C[] with its holes filled by expression e.



3 The Zip Calculus

The zip calculus is a typed lambda calculus extended with n-tuples and sums.
In particular, it starts as $F_\omega$, though in the form of a Pure Type System
(PTS) [2,15]. To this is added a construct for n-tuples and then n-sums are
added (very simply using n-tuples). As the syntax of terms and types was
becoming very close (because tuples exist at the type level), the choice of a PTS
seemed natural: in a PTS, terms, types, and kinds are all written in the same
syntax. Also, the generality of a PTS makes for fewer typing rules. However, the
generality of a PTS can make a type system harder to understand: it is difficult
to know what is a valid term, type, and kind without understanding the type
checking rules.

3.1 Syntax and Semantics

The syntax of the terms of the zip calculus is in Fig. 1. The pseudo syntactic
classes i, d, and t are used to provide intuition for what is enforced by the type
system (but not by the syntax). The first five terms in Fig. 1 correspond to $F_\omega$,
encoded as a PTS (although one needs to see the typing rules in the following
section to get the full story). In a PTS, terms and types are merged into a single
syntax. The correspondence between $F_\omega$ in the standard formulation and as a
PTS is as follows:

                      standard                    PTS
value abstraction     $\lambda x{:}\alpha.\ e$    $\lambda x{:}\alpha.\ e$
type abstraction      $\Lambda\alpha.\ e$         $\lambda\alpha{:}*.\ e$
function type         $\alpha \to \beta$          $\Pi v{:}\alpha.\ \beta$   ($v$ not free in $\beta$)
quantification        $\forall\alpha.\ B$         $\Pi\alpha{:}*.\ B$

So, lambda abstractions are used both for value abstractions and type
abstractions; $\Pi$ terms are used for the function type and quantification; $*$ represents the
type of types. (For a more leisurely introduction to Pure Type Systems, see [15].)
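The merging of terms and types into one syntax can be made concrete with a small Haskell data type. This is a sketch of the PTS core only (no tuples, projections, or sums); the constructor and function names are assumptions made for the illustration.

```haskell
-- Sorts of the core system: * and □.
data Sort = Star | Box
  deriving (Eq, Show)

-- One syntax for terms, types, and kinds.
data Term
  = Var String
  | Lam String Term Term   -- λx:A. e   (value and type abstraction)
  | App Term Term
  | Pi String Term Term    -- Πx:A. B   (function type and quantification)
  | Sort Sort
  deriving (Eq, Show)

-- The function type a → b is the Π term whose bound variable is unused.
arrow :: Term -> Term -> Term
arrow a b = Pi "_" a b

-- The quantification ∀a. B is the Π term that binds a type variable.
forAll :: String -> Term -> Term
forAll a b = Pi a (Sort Star) b

-- The polymorphic identity Λa. λx:a. x and its type ∀a. a → a.
polyId :: Term
polyId = Lam "a" (Sort Star) (Lam "x" (Var "a") (Var "x"))

polyIdType :: Term
polyIdType = forAll "a" (arrow (Var "a") (Var "a"))
```

Both rows of the correspondence that mention abstraction use the single Lam constructor, and both rows that mention types use the single Pi constructor.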






Mark Tullsen 



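$$\begin{array}{rcll}
e & ::= & v & \text{variables} \\
  & \mid & \lambda v{:}t.\ e & \text{abstraction} \\
  & \mid & e_1\ e_2 & \text{application} \\
  & \mid & \Pi v{:}t_1.\ t_2 & \text{type of abstractions} \\
  & \mid & * & \text{type of types} \\
  & \mid & \langle e_1, e_2, \ldots\rangle & \text{tuple} \\
  & \mid & m_n & \text{projection } (1 \le m \le n) \\
  & \mid & n\mathrm{d} & \text{dimension } (1 \le n) \\
  & \mid & \mathrm{D} & \text{type of dimensions} \\
  & \mid & +_d\ t & \text{sum type} \\
  & \mid & \mathit{In}_d & \text{constructors for } +_d \\
  & \mid & \mathit{case}_d & \text{destructor for } +_d \\
i & ::= & e & \text{projections (of type } n\mathrm{d}) \\
d & ::= & e & \text{dimensions (of type D)} \\
t & ::= & e & \text{types and kinds (of type } * \text{ or } \Box) \\
m, n & & & \{\text{natural numbers}\}
\end{array}$$

Fig. 1. Syntax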



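To this base are added the following: (1) Tuples, which are no longer restricted
to the term level but also exist at the type level. (2) Projection constants ($m_n$,
which gets the $m$-th element of an $n$-tuple), their types ($n\mathrm{d}$, the dimensions,
where $m_n : n\mathrm{d}$; “d” here is the literal character), and “D”, the type of these
$n\mathrm{d}$ (“D” has a role analogous to $*$). And (3) n-sums made via n-tuples: for
n-sums ($+_{n\mathrm{d}}\langle t_1, \ldots, t_n\rangle$) the constructor family, $\mathit{In}_{n\mathrm{d}}$, is an n-tuple of
constructors and the destructor $\mathit{case}_{n\mathrm{d}}$ takes an n-tuple of functions.

Since one can write tuples of types, one must distinguish between $\langle t_1, t_2\rangle$ (a
tuple of types, having kind $\Pi\_{:}2\mathrm{d}.\ *$)⁴ and $\times_{2\mathrm{d}}\langle t_1, t_2\rangle$ (a type, i.e., something
with kind $*$).

A 3-tuple such as $\langle e_1, e_2, e_3\rangle$ is a function whose domain is the set $\{1_3, 2_3, 3_3\}$
(the projections with type $3\mathrm{d}$). To get the second element of a 3-tuple, one
applies the tuple to the projection $2_3$; thus “$\langle e_1, e_2, e_3\rangle\ 2_3$” reduces to $e_2$. The type of
the tuple is a “dependent type” (a $\Pi$ term): for instance, $\langle e_1, e_2, e_3\rangle$ has type
“$\Pi i{:}3\mathrm{d}.\ \langle E_1, E_2, E_3\rangle\ i$” where $e_i : E_i$. Genericity over tuple length is achieved
because we can write functions such as “$\lambda d{:}\mathrm{D}.\ \lambda i{:}d.\ e$” in which $d$ can be any
dimension ($1\mathrm{d}, 2\mathrm{d}, \ldots$). Although tuples are functions, the following syntactic
sugar is used to syntactically distinguish tuple functions from standard functions:

$$\begin{array}{rcl}
\langle^{i:d}\ e\rangle & = & \lambda i{:}d.\ e \\
e.i & = & e\ i \\
\times_d\ t & = & \Pi i{:}d.\ t\ i
\end{array}$$

Also, in what follows, $a \to b$ is used as syntactic sugar for $\Pi\_{:}a.\ b$; in this case,
a $\Pi$ type corresponds to a normal function.

The semantics is given operationally: the three reduction rules of Fig. 2 are
applied left to right with a leftmost outermost reduction strategy. Translating
the ($\beta$ reduce) and ($\Pi$ eta) laws into the above syntactic sugar gives these laws:

$$\begin{array}{rcll}
\langle^{i:d}\ e\rangle.j & = & e\{j/i\} & \text{(n-tuple reduce)} \\
\langle^{i:d}\ e.i\rangle & = & e \quad \text{if } e :: \times_d A,\ i \notin fv(e) & \text{(n-tuple eta)}
\end{array}$$

⁴ The variable _ is used for unused variables.

Reduction Rules:

$$\begin{array}{rcll}
(\lambda v{:}t.\ e_1)\ e_2 & = & e_1\{e_2/v\} & (\beta\ \text{reduce}) \\
\langle e_1, \ldots, e_n\rangle\ i_n & = & e_i & (\times\ \text{reduce}) \\
\mathit{case}_d\ e\ (\mathit{In}_d.i\ e') & = & e.i\ e' & (+\ \text{reduce})
\end{array}$$

Eta laws:

$$\begin{array}{rcll}
\lambda v{:}a.\ e\ v & = & e \quad \text{if } e :: \Pi a{:}A.\ b,\ v \notin fv(e) & (\Pi\ \text{eta}) \\
\langle e\ 1_n, \ldots, e\ n_n\rangle & = & e \quad \text{if } e :: \Pi i{:}n\mathrm{d}.\ A & (\times\ \text{eta}) \\
\mathit{case}_d\ \mathit{In}_d\ e & = & e \quad \text{if } e :: +_d\ A & (+\ \text{eta})
\end{array}$$

Instantiation:

$$\begin{array}{rcll}
h \circ \mathit{case}_d\ f & = & \mathit{case}_d\ \langle^{i:d}\ h \circ f.i\rangle \quad \text{if } h \text{ strict} & \text{(inst)} \\
C[\mathit{case}_d\ \langle^{i:d}\ \lambda v{:}t.\ e\rangle\ x] & = & \mathit{case}_d\ \langle^{i:d}\ \lambda v{:}t.\ C[e]\rangle\ x \quad \text{if } C[] \text{ strict} & \text{(inst)}
\end{array}$$

Fig. 2. Laws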



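To give some intuition regarding the semantics of n-tuples, note this equivalence:

$$\begin{array}{cl}
 & \langle^{i:2\mathrm{d}}\ \langle f,g\rangle.i\ \langle x,y\rangle.i\rangle \\
= & \{\times\ \text{eta}\} \\
 & \langle\ \langle^{i:2\mathrm{d}}\ \langle f,g\rangle.i\ \langle x,y\rangle.i\rangle.1_2,\ \langle^{i:2\mathrm{d}}\ \langle f,g\rangle.i\ \langle x,y\rangle.i\rangle.2_2\ \rangle \\
= & \{\text{n-tuple reduce, twice}\} \\
 & \langle\ \langle f,g\rangle.1_2\ \langle x,y\rangle.1_2,\ \langle f,g\rangle.2_2\ \langle x,y\rangle.2_2\ \rangle \\
= & \{\times\ \text{reduce, four times}\} \\
 & \langle f\ x,\ g\ y\rangle
\end{array}$$

The tuples $\langle f,g\rangle$ and $\langle x,y\rangle$ are “zipped” together, thus the name “zip calculus.”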
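The view of a tuple as a function from projections can be approximated in Haskell. The sketch below is a deliberate simplification: all components share one type (Haskell cannot give the dependent type used by the zip calculus), and Ix2, Tuple, pair, zipApply, and toList are hypothetical names.

```haskell
-- The two projections of dimension 2d.
data Ix2 = P1 | P2
  deriving (Eq, Show, Enum, Bounded)

-- A homogeneous "tuple" is a function from projections to components.
type Tuple i a = i -> a

pair :: a -> a -> Tuple Ix2 a
pair x _ P1 = x
pair _ y P2 = y

-- Pointwise application: each function is applied to the component
-- at the same projection, which is the "zipping" of the two tuples.
zipApply :: Tuple i (a -> b) -> Tuple i a -> Tuple i b
zipApply fs xs = \i -> fs i (xs i)

-- Observe a tuple by enumerating its projections.
toList :: (Enum i, Bounded i) => Tuple i a -> [a]
toList t = map t [minBound .. maxBound]
```

Because zipApply never inspects the index type, the same definition works for any dimension, which mirrors the genericity over $d$ in the calculus.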
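$$\frac{c : s \in \mathcal{A}}{\vdash c : s}\ \text{(axiom)}
\qquad
\frac{\Gamma \vdash A : s}{\Gamma, x{:}A \vdash x : A}\ \text{(var)}
\qquad
\frac{\Gamma \vdash b : B \quad \Gamma \vdash A : s}{\Gamma, x{:}A \vdash b : B}\ \text{(weak)}$$

$$\frac{\Gamma \vdash f : (\Pi x{:}A.\ B) \quad \Gamma \vdash a : A}{\Gamma \vdash f\ a : B\{a/x\}}\ \text{(app)}
\qquad
\frac{\Gamma, x{:}A \vdash b : B \quad \Gamma \vdash (\Pi x{:}A.\ B) : t}{\Gamma \vdash (\lambda x{:}A.\ b) : (\Pi x{:}A.\ B)}\ \text{(lam)}$$

$$\frac{\Gamma \vdash A : s \quad \Gamma, x{:}A \vdash B : t \quad (s,t,u) \in \mathcal{R}}{\Gamma \vdash (\Pi x{:}A.\ B) : u}\ \text{(pi)}
\qquad
\frac{\Gamma \vdash a : A \quad \Gamma \vdash B : s \quad A =_\beta B}{\Gamma \vdash a : B}\ \text{(conv)}$$

Fig. 3. Type Judgments for a Pure Type System

$$\frac{\forall j \in \{1..n\}.\ \Gamma \vdash a_j : A_j \quad \Gamma \vdash (\Pi i{:}n\mathrm{d}.\ \langle A_1, \ldots, A_n\rangle\ i) : t}{\Gamma \vdash \langle a_1, \ldots, a_n\rangle : \Pi i{:}n\mathrm{d}.\ \langle A_1, \ldots, A_n\rangle\ i}\ \text{(tuple)}$$

Fig. 4. Additional Type Judgments for the Zip Calculus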



3.2 The Type System 

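The terms of a PTS consist of the first four terms of Fig. 1 (variables, lambda
abstractions, applications, and $\Pi$ terms) plus a set of constants, $\mathcal{C}$. The
specification of a PTS is given by a triple $(\mathcal{S}, \mathcal{A}, \mathcal{R})$ where $\mathcal{S}$ is a subset of $\mathcal{C}$ called the
sorts, $\mathcal{A}$ is a set of axioms of the form “$c : s$” where $c \in \mathcal{C}$, $s \in \mathcal{S}$, and $\mathcal{R}$ is a set
of rules of the form $(s_1, s_2, s_3)$ where $s_1, s_2, s_3 \in \mathcal{S}$. The typing judgments for a
PTS are as in Fig. 3. In a PTS, the definition of $=_\beta$ in the judgment (conv) is
beta-equivalence (alpha-equivalent terms are identified).

In the zip calculus, the set of sorts is $\mathcal{S} = \{1\mathrm{d}, 2\mathrm{d}, \ldots\} \cup \{*, \Box, \mathrm{D}\}$, the set of
constants is $\mathcal{C} = \mathcal{S} \cup \{m_n \mid 1 \le m \le n\}$, and the axioms $\mathcal{A}$ and rules $\mathcal{R}$ are as
follows:

$\mathcal{A}$ axioms          $\mathcal{R}$ rules
$* : \Box$                    $(*, *, *)$                      $\lambda v_e{:}t.\ e$
$m_n : n\mathrm{d}$           $(\Box, *, *)$                   $\lambda v_t{:}T.\ e$
$n\mathrm{d} : \mathrm{D}$    $(\Box, \Box, \Box)$             $\lambda v_t{:}T.\ t$
$\mathrm{D} : \Box$           $(\mathrm{D}, \mathrm{D}, *)$    $\lambda v_i{:}d.\ i$
                              $(\mathrm{D}, *, *)$             $\lambda v_i{:}d.\ e$
                              $(\mathrm{D}, \Box, \Box)$       $\lambda v_i{:}d.\ t$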



The $\mathcal{R}$ rules, used in the (pi) rule, indicate what lambda abstractions are
allowed (which is the same as saying which $\Pi$ terms are well-typed). Here there
are six $\mathcal{R}$ rules which correspond to the six allowed forms of lambda abstraction.
The expression to the right of each rule is an intuitive representation of the type
of lambda abstraction which the rule represents (e: terms, t: types, T: kinds,
i: projections, d: dimensions, $v_x$: variable in class x). For instance, the $(\mathrm{D}, \mathrm{D}, *)$
rule means that lambda abstractions are allowed of form $\lambda v_i{:}d.\ i$ where $d : \mathrm{D}$
(i.e., $d$ is a dimension such as $3\mathrm{d}$) and thus $v_i$ represents a projection such as $2_3$
and the body $i$ must have a type of sort D, and the type of the type of this whole
lambda expression is $*$.

[Fig. 5. Syntax Directed Type Judgments for a Functional PTS. Rules shown: (app), (lam), (axiom), (red).]

[Fig. 6. Syntax Directed Type Judgments for the Zip Calculus. Rules shown: (tuple1), (tuple2), (app'), (red').]
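The specification triple of the zip calculus can be written down as data plus two partial functions. This is a sketch under stated assumptions: the names Sort, Const, axiom, and rule are hypothetical, and the sixth rule is taken to be (D, □, □).

```haskell
-- Sorts: *, □, D, and the dimensions 1d, 2d, ...
data Sort = Star | Box | D | Dim Int
  deriving (Eq, Show)

-- Constants: the sorts plus the projections m_n (Proj m n).
data Const = S Sort | Proj Int Int
  deriving (Eq, Show)

-- The axioms, as a partial function from a constant to its sort.
axiom :: Const -> Maybe Sort
axiom (S Star) = Just Box                              -- *   : □
axiom (S D) = Just Box                                 -- D   : □
axiom (S (Dim n)) | n >= 1 = Just D                    -- nd  : D
axiom (Proj m n) | 1 <= m && m <= n = Just (Dim n)     -- m_n : nd
axiom _ = Nothing

-- The rules (s1, s2, s3), as a partial function from (s1, s2) to s3.
rule :: Sort -> Sort -> Maybe Sort
rule Star Star = Just Star
rule Box Star = Just Star
rule Box Box = Just Box
rule D D = Just Star
rule D Star = Just Star
rule D Box = Just Box     -- assumption: read as (D, □, □)
rule _ _ = Nothing
```

That both relations can be written as Haskell functions at all reflects the property that each constant has at most one sort and each pair of sorts determines at most one result sort.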

In the zip calculus there is an additional term, $\langle e_1, e_2, \ldots\rangle$, which cannot be
treated as a constant in a PTS (ignoring sums for the moment). The addition of
this term requires two extensions to the PTS: one, an additional typing judgment
(Fig. 4) and two, the $=_\beta$ relation in the (conv) judgment must be extended to
include not just ($\beta$ reduce) but also ($\times$ reduce) and ($\times$ eta).

To get generic sums, one needs only add + as a constant and the following
two primitives

$$\begin{array}{rcl}
\mathit{In} & :: & \Pi I{:}\mathrm{D}.\ \Pi a{:}\times_I\langle^{\_:I}\ *\rangle.\ \times_I\langle^{i:I}\ a.i \to +_I\ a\rangle \\
\mathit{case} & :: & \Pi I{:}\mathrm{D}.\ \Pi a{:}\times_I\langle^{\_:I}\ *\rangle.\ \Pi b{:}*.\ \times_I\langle^{i:I}\ a.i \to b\rangle \to (+_I\ a \to b)
\end{array}$$

where In is a generic injection function: e.g., for the sum $+_{2\mathrm{d}}\langle a, b\rangle$ the two
injection functions are “$(\mathit{In}\ 2\mathrm{d}\ \langle a, b\rangle).1_2$” and “$(\mathit{In}\ 2\mathrm{d}\ \langle a, b\rangle).2_2$”.

3.3 Type Checking 

There are numerous properties, such as subject reduction, which are true of Pure 
Type Systems in general [2] . There are also known type checking algorithms for 
certain subclasses of PTSs. Although the zip calculus is not a PTS, it is hoped 
that most results for PTSs will carry over to the “almost PTS” zip calculus. 

A PTS is functional when the relations $\mathcal{A}$ and $\mathcal{R}$ are functions ($c : s_1 \in \mathcal{A}$ and
$c : s_2 \in \mathcal{A}$ imply $s_1 = s_2$; $(s, t, u_1) \in \mathcal{R}$ and $(s, t, u_2) \in \mathcal{R}$ imply $u_1 = u_2$). In the
zip calculus, $\mathcal{A}$ and $\mathcal{R}$ are functions. If a PTS is functional there is an efficient
type-checking algorithm as given in Fig. 5 (cf. [15] and [17]), where the type
judgments of Fig. 3 have been restructured to make them syntax-directed. The
judgment (red) defines the relation “$\Gamma \vdash x :\twoheadrightarrow X$”, where the reduction is beta-reduction.

This algorithm can be modified as in Fig. 6. The rules (tuple1) and (tuple2)
replace (tuple) from Fig. 4. The rules (app') and (red') replace the (app) and
(red) judgments of Fig. 5. Here the reduction relation is extended with ($\times$ reduce)
and the equality is equality up to ($\times$ eta) convertibility. The (app) rule is changed
because the type of $f$ may evaluate to

$$\langle \Pi x{:}a_1.\ b_1,\ \ldots,\ \Pi x{:}a_n.\ b_n\rangle.i$$

and application should be valid when, for instance, this is equivalent to a type
of the form

$$\Pi x{:}(\langle a_1, \ldots, a_n\rangle.i).\ \langle b_1, \ldots, b_n\rangle.i$$

A proof of the soundness and completeness of this algorithm should be similar
to that in [17].

4 Examples 

This section provides several examples of the usefulness of the zip calculus. 
Writing programs in an explicitly typed calculus can be onerous; to alleviate
this, some syntactic shortcuts are often used in the following: the “:t” is dropped
in lambdas and in the n-tuple syntactic sugar; $m$ is put for the projection $m_n$;
the dimension $d$ is dropped from $\times_d$; and, when applying dimensions and types,
“$f_{d,t_1,t_2}$” is put for “$f\ d\ t_1\ t_2$”. Also, $f\ x = e$ is syntactic sugar for $f = \lambda x.\ e$. The
following conventions are used for variables: t, a, b, c, A, B, C for types (terms of
type $*$); i, j, k, l for projections (terms of type $n\mathrm{d}$); and d, I, J, K, L for dimension
variables (terms of type D).

4.1 More General Programs 

An uncurried zip3 is as follows in Haskell:

zip3 :: ([a],[b],[c]) -> [(a,b,c)]
zip3 (a:as, b:bs, c:cs) = (a,b,c) : zip3 (as,bs,cs)
zip3 _ = []

If Haskell had n-tuples, one could write a generic zip as follows:

$$\begin{array}{l}
\mathit{zip} :: \times\langle^{i}\ [a.i]\rangle \to [\times a] \\
\mathit{zip}\ \langle^{i}\ x.i : xs.i\rangle = x : \mathit{zip}\ xs \\
\mathit{zip}\ \_ = [\,]
\end{array}$$

Note that patterns are extended with n-tuples. Unfortunately, this function
cannot be written in the zip calculus (extended with recursive data types and a
fix point operator) unless a primitive such as seqTupleMaybe is added:

$$\mathit{seqTupleMaybe} :: \times\langle^{i}\ a.i \to \mathit{Maybe}\ b.i\rangle \to \times a \to \mathit{Maybe}\ (\times b)$$
However, once this primitive is added, one can define n-tuple patterns in
terms of this primitive. (Using seqTupleMaybe we can translate patterns out of
the language similarly to that done in [16].) Section 5.1 returns to this problem
of functions that must be primitives.
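How a pattern such as “x.i : xs.i” could be translated through seqTupleMaybe can be sketched at the fixed length two. The names seqPairMaybe and zipPair are hypothetical; uncons is the Data.List function that splits a non-empty list into its head and tail.

```haskell
import Data.List (uncons)

-- The 2-tuple instance of seqTupleMaybe: apply a pair of partial functions
-- component-wise and succeed only if every component succeeds.
seqPairMaybe :: (a -> Maybe c, b -> Maybe d) -> (a, b) -> Maybe (c, d)
seqPairMaybe (f, g) (x, y) = (,) <$> f x <*> g y

-- zip for pairs of lists, with the cons pattern expressed as a
-- component-wise use of uncons instead of built-in pattern matching.
zipPair :: ([a], [b]) -> [(a, b)]
zipPair xss =
  case seqPairMaybe (uncons, uncons) xss of
    Just ((x, xs), (y, ys)) -> (x, y) : zipPair (xs, ys)
    Nothing -> []
```

The only length-specific parts are seqPairMaybe and the tuple shapes in the case branch; these are the parts an n-tuple primitive would absorb.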

4.2 More General Laws 

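The parametricity theorem for an uncurried zip3

map (cross3 (f,g,h)) o zip3 = zip3 o cross3 (map f, map g, map h)
where
cross3 (f,g,h) (x,y,z) = (f x, g y, h z)

can be generalized in the zip calculus to this:

$$\begin{array}{l}
\mathit{map}\ (\mathit{cross}_d\ f) \circ \mathit{zip} = \mathit{zip} \circ \mathit{cross}_d\ \langle^{i}\ \mathit{map}\ f.i\rangle \\
\text{where} \\
\mathit{cross}_d\ f\ x = \langle^{i}\ f.i\ x.i\rangle
\end{array}$$

And this law

$$\langle x.1_3,\ x.2_3,\ x.3_3\rangle = x$$

can be generalized to the (n-tuple eta) law:

$$\langle^{i}\ x.i\rangle = x$$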

4.3 More General Derivations 

The original motivation for the zip calculus was to create a language adapted to 
program transformation. This section shows how the zip calculus can simplify 
program transformation. 

A formal derivation of the program derivation sketched in section 2.3 would 
consist of three sub-derivations 



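$$\begin{array}{ll}
e.1_2 = \ldots = C[1_2] & \text{(lemma-1)} \\
e.2_2 = \ldots = C[2_2] & \text{(lemma-2)}
\end{array}$$

$$\begin{array}{cl}
 & e \\
= & \{\times\ \text{eta}\} \\
 & \langle e.1_2,\ e.2_2\rangle \\
= & \{\text{lemma-1, lemma-2}\} \\
 & \langle C[1_2],\ C[2_2]\rangle
\end{array}$$

in which the first two (lemma-1, lemma-2) are virtually identical. Using n-tuples
these two sub-derivations can be merged into a single derivation (lemma-n),
giving this derivation:

$$e.i = \ldots = C[i] \qquad \text{(lemma-n)}$$

$$\begin{array}{cl}
 & e \\
= & \{\text{n-tuple eta}\} \\
 & \langle^{i}\ e.i\rangle \\
= & \{\text{lemma-n}\} \\
 & \langle^{i}\ C[i]\rangle
\end{array}$$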

So, without using a meta-language, we have a derivation that is both shorter 
and more generic. Another example is the derivation of a law, called Abides: 

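$$\begin{array}{cl}
 & \mathit{case}\ \langle \lambda x.\langle a_1, b_1\rangle,\ \lambda y.\langle a_2, b_2\rangle\rangle\ x \\
= & \\
 & \langle \mathit{case}\ \langle \lambda x.a_1,\ \lambda y.a_2\rangle\ x,\ \mathit{case}\ \langle \lambda x.b_1,\ \lambda y.b_2\rangle\ x \rangle
\end{array}$$

Its derivation is

$$\begin{array}{cl}
 & \mathit{case}\ \langle \lambda x.\langle a_1, b_1\rangle,\ \lambda y.\langle a_2, b_2\rangle\rangle\ x \\
= & \{\times\ \text{eta}\} \\
 & \langle (\mathit{case}\ \langle \lambda x.\langle a_1, b_1\rangle,\ \lambda y.\langle a_2, b_2\rangle\rangle\ x).1,\ (\mathit{case}\ \langle \lambda x.\langle a_1, b_1\rangle,\ \lambda y.\langle a_2, b_2\rangle\rangle\ x).2 \rangle \\
= & \{\text{inst, twice}\} \\
 & \langle \mathit{case}\ \langle \lambda x.\langle a_1, b_1\rangle.1,\ \lambda y.\langle a_2, b_2\rangle.1\rangle\ x,\ \mathit{case}\ \langle \lambda x.\langle a_1, b_1\rangle.2,\ \lambda y.\langle a_2, b_2\rangle.2\rangle\ x \rangle \\
= & \{\times\ \text{reduce, four times}\} \\
 & \langle \mathit{case}\ \langle \lambda x.a_1,\ \lambda y.a_2\rangle\ x,\ \mathit{case}\ \langle \lambda x.b_1,\ \lambda y.b_2\rangle\ x \rangle
\end{array}$$

Here is a generic version of Abides

$$\mathit{case}\ \langle^{i}\ \lambda y.\langle^{j}\ m.i.j\ y\rangle\rangle\ x = \langle^{j}\ \mathit{case}\ \langle^{i}\ \lambda y.\ m.i.j\ y\rangle\ x\rangle$$

and its derivation is

$$\begin{array}{cl}
 & \mathit{case}\ \langle^{i}\ \lambda y.\langle^{j}\ m.i.j\ y\rangle\rangle\ x \\
= & \{\text{n-tuple eta}\} \\
 & \langle^{j}\ (\mathit{case}\ \langle^{i}\ \lambda y.\langle^{j}\ m.i.j\ y\rangle\rangle\ x).j\rangle \\
= & \{\text{inst}\} \\
 & \langle^{j}\ \mathit{case}\ \langle^{i}\ \lambda y.\langle^{j}\ m.i.j\ y\rangle.j\rangle\ x\rangle \\
= & \{\text{n-tuple reduce}\} \\
 & \langle^{j}\ \mathit{case}\ \langle^{i}\ \lambda y.\ m.i.j\ y\rangle\ x\rangle
\end{array}$$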

which corresponds directly to the non-generic derivation above. Note that in- 
stantiation is only applied once (not twice) and reduction once (not four times), 
and this law is generic over sums of any length and products of any length. 
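The smallest instance of Abides (a 2-sum and a 2-tuple) can be checked in Haskell, with Either standing in for the sum and the Prelude's either for case. This is an illustrative sketch: abidesLhs and abidesRhs are hypothetical names, and the equality is only claimed for defined arguments because Haskell tuples are lifted.

```haskell
-- Both sides of Abides for a binary sum and a binary product.
-- The branch bodies a1, b1, a2, b2 are passed as functions of the
-- value carried by the sum.
abidesLhs, abidesRhs
  :: (x -> a, x -> b) -> (y -> a, y -> b) -> Either x y -> (a, b)

-- case over branches that each build a pair
abidesLhs (a1, b1) (a2, b2) =
  either (\x -> (a1 x, b1 x)) (\y -> (a2 y, b2 y))

-- a pair of cases, one per component
abidesRhs (a1, b1) (a2, b2) s =
  (either a1 a2 s, either b1 b2 s)
```

For sums or products of another size, both functions would have to be rewritten by hand, whereas the generic law above covers every size at once.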



4.4 Nested N-Tuples 

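Typical informal notations for representing n-tuples are ambiguous: e.g., one
writes $f\,\bar{x}$ for the “vector” $\langle f\ x_1, \ldots, f\ x_n\rangle$ but now $g\,(f\,\bar{x})$ could signify either
$\langle g\,(f\ x_1), \ldots, g\,(f\ x_n)\rangle$ or $g\,\langle f\ x_1, \ldots, f\ x_n\rangle$. These notations do not extend to nested
n-tuples. In the zip calculus one can easily manipulate nested n-tuples
(“matrices”). For example, the application of a function to every element of a
three-dimensional matrix is coded as follows (note that $\langle^{\_:d}\ x\rangle$ is a tuple of identical
elements):

$$\begin{array}{l}
\mathit{map3Dmatrix}_{I,J,K,a,b} :: (a \to b) \to \times\langle^{\_:I}\ \times\langle^{\_:J}\ \times\langle^{\_:K}\ a\rangle\rangle\rangle \to \times\langle^{\_:I}\ \times\langle^{\_:J}\ \times\langle^{\_:K}\ b\rangle\rangle\rangle \\
\mathit{map3Dmatrix}_{I,J,K,a,b}\ f\ m = \langle^{i}\ \langle^{j}\ \langle^{k}\ f\ m.i.j.k\rangle\rangle\rangle
\end{array}$$

In the definition of map3Dmatrix, the expression $\langle^{i}\ \langle^{j}\ \langle^{k}\ f\ m.i.j.k\rangle\rangle\rangle$ is a
3-dimensional matrix where “f m.i.j.k” is the value of the “.i.j.k”-th element,
which here is “f” applied to the corresponding value of the original matrix
“m.i.j.k”. Matrix transposition is straightforward:

$$\begin{array}{l}
\mathit{transpose}_{I,J,a} :: \times\langle^{i:I}\ \times\langle^{j:J}\ a.i.j\rangle\rangle \to \times\langle^{j:J}\ \times\langle^{i:I}\ a.i.j\rangle\rangle \\
\mathit{transpose}_{I,J,a}\ x = \langle^{j}\ \langle^{i}\ x.i.j\rangle\rangle
\end{array}$$

The transpose is done by “reversing” the subscripts of x. Note that the type
variable “a” above is a matrix of types and, for any n, transpose could be
applied to a tuple of n-tuples. An application of transpose would be reduced
as follows:

$$\begin{array}{cl}
 & (\mathit{transpose}_{3\mathrm{d},2\mathrm{d},a}\ \langle\langle x_1,x_2\rangle, \langle y_1,y_2\rangle, \langle z_1,z_2\rangle\rangle).2.3 \\
\to & \langle^{j}\ \langle^{i}\ \langle\langle x_1,x_2\rangle, \langle y_1,y_2\rangle, \langle z_1,z_2\rangle\rangle.i.j\rangle\rangle.2.3 \\
\to & \langle^{i}\ \langle\langle x_1,x_2\rangle, \langle y_1,y_2\rangle, \langle z_1,z_2\rangle\rangle.i.2\rangle.3 \\
\to & \langle\langle x_1,x_2\rangle, \langle y_1,y_2\rangle, \langle z_1,z_2\rangle\rangle.3.2 \\
\to & \langle z_1,z_2\rangle.2 \\
\to & z_2
\end{array}$$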

Note the various ways one can transform a two dimensional matrix: 



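$$\begin{array}{ll}
\langle^{i}\ \langle^{j}\ m.i.j\rangle\rangle & m \text{ itself} \\
\langle^{j}\ \langle^{i}\ m.i.j\rangle\rangle & \text{the transpose of } m \\
\langle^{i}\ \langle^{j}\ f\ m.i.j\rangle\rangle & f \text{ applied to each element of } m \\
\langle^{i}\ f\ \langle^{j}\ m.i.j\rangle\rangle & f \text{ applied to each “row” of } m \\
\langle^{j}\ f\ \langle^{i}\ m.i.j\rangle\rangle & f \text{ applied to each “column” of } m
\end{array}$$

Clearly this notation extends to matrices of higher dimensions. Some laws
about the transpose function are as follows:

$$\begin{array}{rcl}
m.i.j & = & (\mathit{transpose}_{I,J,a}\ m).j.i \\
m & = & \mathit{transpose}_{J,I,b}\ (\mathit{transpose}_{I,J,a}\ m)
\end{array}$$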



Here is a proof of the latter (a proof of the former is part of the derivation): 
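$$\begin{array}{cl}
 & \mathit{transpose}_{J,I,b}\ (\mathit{transpose}_{I,J,a}\ m) \\
= & \{\text{unfold transpose, twice}\} \\
 & \langle^{l}\ \langle^{k}\ \langle^{j}\ \langle^{i}\ m.i.j\rangle\rangle.k.l\rangle\rangle \\
= & \{\text{n-tuple reduce}\} \\
 & \langle^{l}\ \langle^{k}\ \langle^{i}\ m.i.k\rangle.l\rangle\rangle \\
= & \{\text{n-tuple reduce}\} \\
 & \langle^{l}\ \langle^{k}\ m.l.k\rangle\rangle \\
= & \{\text{n-tuple eta}\} \\
 & \langle^{l}\ m.l\rangle \\
= & \{\text{n-tuple eta}\} \\
 & m
\end{array}$$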
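The transpose laws can be mirrored in Haskell by modelling a matrix as a curried function of its two indices. This is a simplification under stated assumptions: all elements share one type, indices are plain Ints counted from 1, and Matrix, transposeM, and sample are hypothetical names.

```haskell
-- A matrix is a function from a row index and a column index to an element.
type Matrix i j a = i -> j -> a

-- Transposition "reverses the subscripts".
transposeM :: Matrix i j a -> Matrix j i a
transposeM m = \j i -> m i j

-- The 3-by-2 matrix <<x1,x2>,<y1,y2>,<z1,z2>> from the text, with each
-- element represented by its name. Only indices 1..3 and 1..2 are meaningful.
sample :: Matrix Int Int String
sample i j = ["x", "y", "z"] !! (i - 1) ++ show j
```

With these definitions, transposeM sample 2 3 selects the same element as sample 3 2, following the reduction sequence shown earlier, and applying transposeM twice gives back a function that agrees with sample at every index.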



4.5 Generic Catamorphisms 

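It was obvious that Haskell’s zip family of functions could benefit from n-tuples;
but interestingly, catamorphisms [10,11,13] can also benefit from n-tuples, giving
catamorphisms over mutually recursive data structures.

First, a fix point operator for terms, fix, and a fix point operator at the type
level, $\mu$, must be added to the calculus. Normally, the kind of $\mu$ is $(* \to *) \to *$
(i.e., it takes a functor of kind $* \to *$ and returns a type), but here the kind of
$\mu_d$ is $(\times\langle^{\_:d}\ *\rangle \to \times\langle^{\_:d}\ *\rangle) \to \times\langle^{\_:d}\ *\rangle$, i.e., it takes a functor transforming d-tuples
of types and returns a d-tuple of types. (In the following, the subscript of $\mu$ is
dropped when clear from the context.) The primitives in and out now work on
tuples of functions. Note how their types have been extended:

$$\begin{array}{rcll}
\mathit{in}_F & :: & F(\mu F) \to \mu F & \text{original} \\
\mathit{in}_{d,F} & :: & \times\langle^{i:d}\ (F(\mu F)).i \to (\mu F).i\rangle & \text{n-tuple} \\
\mathit{out}_F & :: & \mu F \to F(\mu F) & \text{original} \\
\mathit{out}_{d,F} & :: & \times\langle^{i:d}\ (\mu F).i \to (F(\mu F)).i\rangle & \text{n-tuple}
\end{array}$$

From these a more generic cata can be defined:

$$\begin{array}{rcll}
\mathit{cata}_{F,a} & :: & (F\ a \to a) \to (\mu F \to a) & \text{original} \\
\mathit{cata}_{d,F,a} & :: & \times\langle^{i:d}\ (F\ a).i \to a.i\rangle \to \times\langle^{i:d}\ (\mu F).i \to a.i\rangle & \text{n-tuple} \\
\mathit{cata}_{F,a}\ \phi & = & \mathit{fix}\ \lambda f.\ \phi \circ (F\ f) \circ \mathit{out}_F & \text{original} \\
\mathit{cata}_{d,F,a}\ \phi & = & \mathit{fix}\ \lambda f.\ \langle^{i:d}\ \phi.i \circ (F\ f).i \circ (\mathit{out}_{d,F}).i\rangle & \text{n-tuple}
\end{array}$$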


n-tuple 



(Of course, since the definition of cata is polytypic in the first place, this assumes
that there is some form of polytypism: note the application of the functor F to
a term.) So, $\mathit{cata}_{d,F,a}$ takes and returns an n-tuple of functions. All laws (such
as cata-fusion) can now be generalized. Also, the standard functor laws for a
functor F of kind $* \to *$

$$\begin{array}{rcl}
\mathit{id} & = & F\ \mathit{id} \\
F\ f \circ F\ g & = & F\ (f \circ g)
\end{array}$$

can be generalized to functors of kind $\times\langle^{\_:d}\ *\rangle \to \times\langle^{\_:d}\ *\rangle$:

$$\begin{array}{rcl}
\langle^{\_:d}\ \mathit{id}\rangle & = & F\ \langle^{\_:d}\ \mathit{id}\rangle \\
\langle^{i:d}\ (F\ f).i \circ (F\ g).i\rangle & = & F\ \langle^{i:d}\ f.i \circ g.i\rangle
\end{array}$$

The original cata and functor laws can be derived from these by instantiating
the n-tuples to 1-tuples and then making use of the isomorphism $\times\langle a\rangle \cong a$ (the
bijections being $\lambda x.\ x.1_1$ and $\lambda x.\ \langle x\rangle$).
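The idea of a catamorphism that takes and returns a tuple of functions can be illustrated in Haskell for one fixed pair of mutually recursive types. This is a hand-written instance, not the generic definition above; Tree, Forest, cataTF, and sizes are hypothetical names.

```haskell
-- Two mutually recursive data types.
data Tree a = Node a (Forest a)
data Forest a = Nil | Cons (Tree a) (Forest a)

-- A fold over both types at once: it takes a tuple of algebras
-- (one for Node, and one each for Nil and Cons) and returns a
-- pair of functions, one per type.
cataTF
  :: (a -> f -> t, (f, t -> f -> f))
  -> (Tree a -> t, Forest a -> f)
cataTF (node, (nil, cons)) = (foldT, foldF)
  where
    foldT (Node x ts) = node x (foldF ts)
    foldF Nil = nil
    foldF (Cons t ts) = cons (foldT t) (foldF ts)

-- Example: counting nodes in a tree and in a forest.
sizes :: (Tree a -> Int, Forest a -> Int)
sizes = cataTF (\_ n -> 1 + n, (0, (+)))
```

The two result functions are defined together because each calls the other; an n-tuple cata would express this pairing for any number of mutually recursive types.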



5 Conclusion 

5.1 Limitations 

The zip calculus does not give polytypism (nor does polytypism give n-tuples);
these are orthogonal language extensions:

- Polytypism: generalizes zipList, zipMaybe, zipTree, ...

- N-tuples: generalizes zip, zip3, zip4, ...

An n-tuple is similar to a heterogeneous array (or heterogeneous finite list); but
although one can map over n-tuples, zip n-tuples together, and transpose nested
n-tuples, one cannot induct over n-tuples. So, n-tuples are clearly limited in what
they can express. As a result, one cannot define the following functions in the
zip calculus:

$$\begin{array}{l}
\mathit{tupleToList}_d :: \times\langle^{\_:d}\ a\rangle \to \mathit{List}\ a \\
\mathit{seqTupleL},\ \mathit{seqTupleR} :: \mathit{Monad}\ m \Rightarrow \times\langle^{i}\ a.i \to m\ b.i\rangle \to \times a \to m\,(\times b)
\end{array}$$

However, if we provide seqTupleL and seqTupleR as primitives,

- Each of these families of Haskell functions can be generalized to one generic
function: zip..., zipWith..., unzip..., and liftM1...

- The function seqTupleMaybe from section 4.1 can be defined.

- Several of Haskell’s list functions could also be defined for n-tuples: zip,
zip3, ..., zipWith, zipWith3, ..., unzip, unzip3, ..., map, sequence, mapM,
transpose, mapAccumL, mapAccumR. (These functions all act “uniformly” on
lists: they act on lists without permuting the elements or changing their
length.)

Other functions cannot even be given a type in the zip calculus. For instance, 
there is the curry family of functions 

curry2 :: (a->b->c) -> (a,b)->c

curry3 :: (a->b->c->d) -> (a,b,c)->d



but there is no way to give a type to a generic curry. Extending the zip calculus 
to type this generic curry is an area for future research. 
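For reference, the first two members of this family can be written in Haskell as below. Note that the Haskell Prelude gives the name uncurry to the function of type (a->b->c) -> (a,b)->c, so the sketch uses the hypothetical names uncurry2 and uncurry3 for the functions that the text calls curry2 and curry3.

```haskell
-- The 2-tuple member of the family; this is exactly the Prelude's uncurry.
uncurry2 :: (a -> b -> c) -> (a, b) -> c
uncurry2 = uncurry

-- The 3-tuple member has to be written separately.
uncurry3 :: (a -> b -> c -> d) -> (a, b, c) -> d
uncurry3 f (x, y, z) = f x y z
```

The difficulty in typing a generic version is visible in the signatures: the number of arrows in the argument function's type grows with the tuple length, and that is not a shape the product type of an n-tuple can describe.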
5.2 Relation to Other Work

Polytypic programming [10,11,13] has similar goals to this work (e.g., PolyP [8]
and Functorial ML [9]). However, as just noted, the genericity provided by
polytypism and that provided by n-tuples appear orthogonal. As seen in section 4.5,
with both polytypism and n-tuples some very generic programs and laws can be written.

Two approaches that achieve the same genericity as n-tuples are the following:
First, one can forgo typed languages and use an untyped language to achieve
this level of genericity: e.g., in Lisp a list can be used as an n-tuple. Second, a
language with dependent types [1] could encode n-tuples (and much more); though
the disadvantages are that type checking is undecidable (not to mention the lack
of type inference) and the types are more complex. However, the zip calculus can
be viewed as a way to add dependent types to a typed language in a restricted
way.

It would be a simple and obvious extension to allow for finite sets other than
the naturals as projections, e.g., one could have strings as projections and
finite sets of strings as dimensions. Currently, projections have their dimension
embedded (e.g., the projection “$1_3$” has dimension (or type) “3d”); to allow for
projections that are “polymorphic” over dimensions (e.g., projection 1 could be
applied to a tuple of any size) would take the zip calculus into the realm of
extensible records [4,14,19]. N-tuples and extensible records seem to be orthogonal
issues, though further investigation is required.

Related also is Hoogendijk’s thesis [5] in which he develops a notation for 
n-tuple valued functors for program calculation; his notation is variable free, 
categorical, and heavily overloaded. 



5.3 Summary 

Implementation has not been addressed. One method is to simply inline all n- 
tuples, although this could lead to code explosion and does not support separate 
compilation. Another method is to implement n-tuples as functions (as they are 
just another form of function); just as there is a range of implementation tech- 
niques for polymorphic functions, there are analogous choices for implementing 
functions generic over dimensions. 

Future work is (1) to give a formal proof of various properties of the zip
calculus, such as subject reduction, (2) to extend the zip calculus to be polytypic,
(3) to increase the expressiveness of the zip calculus (so seqTupleL, seqTupleR,
and tupleToList can be defined in the language and curry could be given a
type), and (4) to implement a type inference algorithm for the zip calculus.

I hope to have shown that the genericity provided by n-tuples is useful in a 
programming language and particularly useful in program transformation. Al- 
though there are other solutions, the calculus presented here is a simple solution 
to getting n-tuples in a typed language. One notable benefit of n-tuples in a trans- 
formation system is that they allow one to do many program transformations 
by simple equational reasoning which otherwise would require a meta-language. 
Abstract. We present some new theorems that equate an iteration to
a sequential composition of stronger iterations, and use these theorems 
to simplify and generalize a number of known techniques for pretending 
atomicity in concurrent programs. 



1 Introduction 

One way to simplify the analysis of a concurrent program is to pretend that 
certain complex operations execute atomically. For example, one can sometimes 
pretend that messages are received as soon as they are sent, or that servers 
respond immediately to requests. Reduction theorems provide formal justification 
for such pretenses; they typically depend on commutativity arguments to convert 
an arbitrary execution to an equivalent one in which complex operations run 
without interruption. 

Proofs of reduction theorems (as well as many other theorems in concurrency 
control) can be simplified by applying general-purpose separation theorems that 
equate an iteration to a sequential composition of stronger iterations. Separation 
theorems have received little attention, partly because the obvious theorems are 
not strong enough for serious applications, and partly because they are awkward 
to state and prove in formalisms that do not support sequential composition of 
nondeterministically terminating iterations. Nevertheless, the use of separation 
has allowed us to consistently generalize and simplify existing reduction tech- 
niques; the new reduction theorems 

— work in arbitrary contexts (e.g., allowing control to start or end inside of a 
critical section), 

— have weaker hypotheses, (e.g., allowing environment actions to move control 
in and out of critical sections), 

— apply to both terminating and nonterminating executions, and 

— have shorter, simpler proofs. 

2 Omega Algebra 

An omega algebra is an algebraic structure over the operators (in order of
increasing precedence) 0 (nullary), 1 (nullary), + (binary infix), · (binary infix,
usually written as simple juxtaposition), $\star$ (binary infix, same precedence as ·),
* (unary suffix), and $\omega$ (unary suffix), satisfying the following axioms¹:

$$\begin{array}{ll}
(x + y) + z = x + (y + z) & x \le y \iff x + y = y \\
x + y = y + x & \\
x + x = x & x^* = 1 + x + x^*\,x^* \\
0 + x = x & x\,y \le x \implies x\,y^* = x \quad (*\ \text{ind}) \\
x\,(y\,z) = (x\,y)\,z & x\,y \le y \implies x^*\,y = y \quad (*\ \text{ind}) \\
0\,x = x\,0 = 0 & \\
1\,x = x\,1 = x & x \star y = x^\omega + x^*\,y \\
x\,(y + z) = x\,y + x\,z & x^\omega = x\,x^\omega \\
(x + y)\,z = x\,z + y\,z & x \le y\,x + z \implies x \le y \star z \quad (\omega\ \text{ind})
\end{array}$$


(In parsing formulas, • and * associate to the right; e.g., u v -k x -k y parses 
to {u ■ (w“ + V* ■ {x‘^ + X* ■ y))). In proofs, we use the hint “(dist)” to indicate 
application of the distributivity laws, and the hint “ (hyp)” to indicate the use 
of hypotheses.) 

These axioms are sound and complete for the usual equational theory of 
omega-regular expressions. (Completeness holds only for standard terms, where 
the first arguments to •, and * are regular.) Thus, we make free use, without 
proof, of familiar equations from the theory of (omega-)regular languages (e.g., 
X* X* =x*). 

It is easy to see that the axioms not mentioning ${}^{\omega}$ or $\star$ are
closed under duality, where the dual of a formula is obtained by reversing the
arguments to $\cdot$. Moreover, it can be shown that the remaining axioms are a
conservative extension, so the dual of any theorem not mentioning ${}^{\omega}$
or $\star$ is also a theorem. In hints, we indicate the use of the dual of a
theorem by priming the hint (as in (10)').

Many of our theorems treat ${}^{*}$ iteration and $\star$ iteration in a uniform
way; to facilitate these, the symbol $\circ$ (binary infix, right associative,
same precedence as $\cdot$ and $\star$) ranges over the functions $\star$ and
$(\lambda x, y : x^{*}y)$. For example, the equation
$x \circ x \circ y = x \circ y$ abbreviates the conjunction
$x^{*}x^{*}y = x^{*}y \;\land\; x \star x \star y = x \star y$.



Several theorems of this type are used in our proofs: 

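\[
\begin{array}{lr}
x \circ y = y + x\,x \circ y & (1) \\
x^{*}\,x \circ y = x \circ y & (2) \\
x \star x \circ y = x \star y & (3) \\
(x + y) \circ z = (y^{*}x) \circ y \circ z & (4) \\
(x + y) \circ z = y \circ z + y^{*}x\,(x + y) \circ z & (5) \\
x \circ y \circ z \le (x + y) \circ z & (6) \\
x \circ x \circ y = x \circ y & (7) \\
x \circ y = x^{*}y + x \circ 0 & (8)
\end{array}
\]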
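
As a small worked instance of how a single $\circ$ equation covers both kinds of iteration, theorem (8) can be unfolded separately for the two readings of $\circ$. The layout below is only an illustrative sketch (it is not taken from the original proofs) and uses nothing beyond the definition of $\star$, the annihilation axiom $x\,0 = 0$, and the additive axioms.

```latex
% Reading \circ as (\lambda x, y : x^{*} y), so that x \circ y means x^{*} y:
\begin{align*}
x^{*} y + x^{*}\,0 &= x^{*} y + 0 && \text{(axiom $x\,0 = 0$)} \\
                   &= x^{*} y     && \text{(axioms $0 + x = x$ and commutativity)}
\end{align*}
% Reading \circ as \star:
\begin{align*}
x \star y &= x^{\omega} + x^{*} y                 && \text{(definition of $\star$)} \\
          &= x^{*} y + (x^{\omega} + x^{*}\,0)    && \text{($x^{*}\,0 = 0$; commutativity)} \\
          &= x^{*} y + x \star 0                  && \text{(definition of $\star$)}
\end{align*}
```

In both readings the term $x \circ 0$ isolates the infinite executions of the iteration: it is $0$ for ${}^{*}$ and $x^{\omega}$ for $\star$.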



^1 The axioms not mentioning ${}^{\omega}$ are equivalent to Kozen’s axioms for
Kleene algebra [5], plus the three axioms for omega terms.

y is a complement of x iff $x\,y = 0$ and $x + y = 1$. It is easy to show
that complements (when they exist) are unique and that complementation is
an involution; a predicate is an element of the algebra with a complement. In
this paper, p, q, and Q range over predicates, with complements $\bar{p}$,
$\bar{q}$, and $\bar{Q}$. It is easy to show that the predicates form a Boolean
algebra, with $+$ as disjunction, $\cdot$ as conjunction, 0 as false, 1 as true,
complementation as negation, and $\le$ as implication. Common properties of
Boolean algebras (e.g., $p\,q = q\,p$) are used silently in proofs, as is the fact

\[ x\,p\,y = 0 \implies x\,y = x\,\bar{p}\,y \]

Unlike previous axiomatizations of omega-regular languages, the omega algebra
axioms support several interesting programming models, where (intuitively)
0 is magic, 1 is skip, $+$ is chaotic nondeterministic choice, $\cdot$ is
sequential composition, $\le$ is refinement, $x^{*}$ is executed by executing x
any finite number of times, and $x^{\omega}$ is executed by executing x an
infinite number of times. (This correspondence holds only for standard terms.)
The results of this paper are largely motivated by the relational model, where
terms denote binary relations over a state space, 0 is the empty relation, 1 is
the identity relation, $\cdot$ is relational composition, $+$ is union,
${}^{*}$ is reflexive-transitive closure, $\le$ is subset, and $x^{\omega}$
relates an input state s to an output state if there is an infinite sequence of
states starting with s, with consecutive states related by x. (Thus,
$x^{\omega}$ relates an input state to either all states or none, and
$x^{\omega} = 0$ iff x is well-founded.) Predicates are identified with the set
of states in their domain (i.e., the states from which they can be executed).
In examples, we use the relational model exclusively; however, our theorems are
equally applicable to trace-based models, where actions can have permanent
effects (like output) that cannot be erased by later divergence.
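
The relational model described above is easy to experiment with on a finite state space. The following is a minimal sketch, not part of the original development: relations are Python sets of pairs, and the helper names (`compose`, `star`, `omega`, `star_op`) are illustrative assumptions. On a finite state space, a state starts an infinite x-chain exactly when it survives repeated removal of states that have no remaining x-successor.

```python
def compose(x, y):
    """Relational composition: first x, then y."""
    return {(a, c) for (a, b) in x for (b2, c) in y if b == b2}


def identity(states):
    """The relation 1 (skip) over the given state space."""
    return {(s, s) for s in states}


def star(x, states):
    """Reflexive-transitive closure x* over the given state space."""
    result = identity(states)
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger


def omega(x, states):
    """x^omega: relate s to every state iff an infinite x-chain starts at s."""
    alive = set(states)
    while True:
        # Keep only states that still have an x-successor among the survivors.
        keep = {a for (a, b) in x if a in alive and b in alive}
        if keep == alive:
            break
        alive = keep
    return {(s, t) for s in alive for t in states}


def star_op(x, y, states):
    """The binary iteration x * y read as x^omega + x* y."""
    return omega(x, states) | compose(star(x, states), y)


STATES = {0, 1, 2}
step = {(0, 1), (1, 2)}   # well-founded: every run stops at state 2
loop = {(0, 1), (1, 0)}   # states 0 and 1 can run forever
```

Here `omega(step, STATES)` is empty because `step` is well-founded, while `omega(loop, STATES)` relates states 0 and 1 to every state, matching the remark that $x^{\omega}$ relates an input state to either all states or none.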

In using omega algebra to reason about a program, we usually work with two 
terms, one describing the finite executions of the program, the other describing 
the infinite executions. However, when working with flat iterations (as we do 
here), we can merge these terms into a single term using o and a continuation 
variable. For example, to say that the program that executes x for a while is 
equivalent to the program that executes y for a while, we write the equation 
$x \circ z = y \circ z$; the instantiation $\circ = {}^{*}$, $z = 1$ (i.e.,
$x^{*} = y^{*}$) says that the two programs have the same finite executions, and
the instantiation $\circ = \star$, $z = 0$ (i.e., $x^{\omega} = y^{\omega}$)
says that the two programs have the same infinite executions.
Whenever possible, we prefer to work directly with o equations, taking care of 
both finite and infinite executions with a single proof. 

3 Iteration 

A standard theorem in program verification says that a predicate preserved 
by the body of a loop is preserved by the loop. The iteration theorem, below, 
generalizes the loop invariance theorem by (1) generalizing the invariant (x) from 
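predicates to arbitrary terms; (2) allowing the loop body (y) to be transformed
(into z) by the passage of the invariant^2; and (3) allowing for exceptional
cases (w) where the invariant is not preserved:

\[ x\,y \le z\,x + w \implies x\,y \circ u \le z \circ (x\,u + w\,y \circ u) \qquad (9) \]

We prove the ${}^{*}$ and $\star$ cases separately. Here’s the ${}^{*}$ part:

\[
\begin{array}{lll}
    & x\,y^{*}u & \\
\le & z^{*}(x + w\,y^{*})\,u & \{\,x\,y^{*} \le z^{*}(x + w\,y^{*})\ \text{(below)}\,\} \\
=   & z^{*}(x\,u + w\,y^{*}u) & \{\,\text{(dist)}\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & x\,y^{*} & \\
\le & z^{*}(x + w\,y^{*})\,y^{*} & \{\,x \le z^{*}(x + w\,y^{*})\,\} \\
=   & z^{*}(x + w\,y^{*}) & \{\,\text{(proof below)};\ ({}^{*}\ \text{ind})\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & z^{*}(x + w\,y^{*})\,y & \\
=   & z^{*}x\,y + z^{*}w\,y^{*}y & \{\,\text{(dist)}\,\} \\
\le & z^{*}(z\,x + w) + z^{*}w\,y^{*}y & \{\,x\,y \le z\,x + w\ \text{(hyp)}\,\} \\
\le & z^{*}x + z^{*}w + z^{*}w\,y^{*} & \{\,\text{(dist)};\ z^{*}z \le z^{*};\ y^{*}y \le y^{*}\,\} \\
\le & z^{*}(x + w\,y^{*}) & \{\,w \le w\,y^{*};\ \text{(dist)}\,\}
\end{array}
\]

And here’s the $\star$ part:

\[
\begin{array}{lll}
    & x\,y \star u & \\
=   & x\,(u + y\,y \star u) & \{\,y \star u = u + y\,y \star u\ (1)\,\} \\
=   & x\,u + x\,y\,y \star u & \{\,\text{(dist)}\,\} \\
\le & x\,u + (z\,x + w)\,y \star u & \{\,x\,y \le z\,x + w\ \text{(hyp)}\,\} \\
=   & z\,(x\,y \star u) + (x\,u + w\,y \star u) & \{\,\text{(dist)}\,\}
\end{array}
\]

so $x\,y \star u \le z \star (x\,u + w\,y \star u)$ by ($\star$ ind).

□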
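
The iteration theorem can be sanity-checked in the finite relational model by random testing. The sketch below is an illustration under stated assumptions, not a proof and not part of the original paper: relations over three states are drawn at random, and the exceptional term w is chosen as the smallest relation that makes the hypothesis $x\,y \le z\,x + w$ true. All function names and the sampling parameters are hypothetical.

```python
import random

STATES = range(3)


def compose(x, y):
    """Relational composition: first x, then y."""
    return {(a, c) for (a, b) in x for (b2, c) in y if b == b2}


def star(x):
    """Reflexive-transitive closure over STATES."""
    result = {(s, s) for s in STATES}
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger


def omega(x):
    """Relate s to every state iff an infinite x-chain starts at s."""
    alive = set(STATES)
    while True:
        keep = {a for (a, b) in x if a in alive and b in alive}
        if keep == alive:
            return {(s, t) for s in alive for t in STATES}
        alive = keep


def iterate(x, y, kind):
    """x o y: kind '*' gives x* y, kind 'star' gives x^omega + x* y."""
    finite = compose(star(x), y)
    return finite if kind == "*" else omega(x) | finite


def random_relation(rng):
    return {(a, b) for a in STATES for b in STATES if rng.random() < 0.3}


def check_iteration(samples=300, seed=0):
    rng = random.Random(seed)
    for _ in range(samples):
        x, y, z, u = (random_relation(rng) for _ in range(4))
        w = compose(x, y) - compose(z, x)  # smallest w satisfying the hypothesis
        for kind in ("*", "star"):
            lhs = compose(x, iterate(y, u, kind))
            rhs = iterate(z, compose(x, u) | compose(w, iterate(y, u, kind)), kind)
            if not lhs <= rhs:
                return False
    return True
```

Because the relational model satisfies the omega algebra axioms, `check_iteration()` is expected to find no violation for either reading of $\circ$; a failure would indicate a bug in the sketch rather than in the theorem.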

The separation theorems of the next section are often used in tandem with 
the following “loop elimination” lemmas that remove extraneous iterations: 

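\[ x\,y = 0 \implies x\,y \circ z = x\,z \qquad (10) \]

\[
\begin{array}{lll}
    & x\,y \circ z & \\
\le & 0 \circ x\,z & \{\,x\,y = 0\ \text{(hyp), so}\ x\,y \le 0\,x;\ \text{(iter)}\,\} \\
=   & x\,z & \{\,0 \circ u = u\,\}
\end{array}
\]

□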



\[ x\,y = 0 \implies x \circ y = y + x \circ 0 \qquad (11) \]

\[
\begin{array}{lll}
    & x \circ y & \\
=   & x^{*}y + x \circ 0 & \{\,(8)\,\} \\
=   & y + x \circ 0 & \{\,x\,y = 0\ \text{(hyp)};\ (10)'\,\}
\end{array}
\]

□

^2 This allows the iteration theorem to subsume forward data refinement and
simulation (and backward refinement and simulation for finite executions).

4 Separation

In this section, we prove some general theorems for rewriting an iteration to a
sequential composition of stronger iterations. They typically have the form

\[ \ldots \implies (x + y) \circ z = x \circ y \circ z \]

(In proofs, we show only $(x + y) \circ z \le x \circ y \circ z$; the reverse
inequality follows from 6.) We call these “separation theorems” because they
allow a heterogeneous iteration of x’s and y’s to be partitioned into separate
x and y iterations. In a programming context, we can think of x and y as
transition relations; the conclusion says that running x and y concurrently for
a while has the same effect as running x for a while and then running y for a
while.

\[ y\,x \le x\,y^{*} \implies (x + y) \circ z = x \circ y \circ z \qquad (12) \]



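\[
\begin{array}{lll}
    & (x + y) \circ z & \\
\le & y^{*}(y^{*}x) \circ y \circ z & \{\,1 \le y^{*};\ (x + y) \circ z = (y^{*}x) \circ y \circ z\ (4)\,\} \\
\le & x \circ y^{*}\,y \circ z & \{\,y^{*}(y^{*}x) \le x\,y^{*}\ \text{(below)};\ \text{(iter)}\,\} \\
=   & x \circ y \circ z & \{\,y^{*}\,y \circ z = y \circ z\ (2)\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & y^{*}y^{*}x & \\
=   & y^{*}x & \{\,y^{*}y^{*} = y^{*}\,\} \\
\le & x\,(y^{*})^{*} & \{\,y\,x \le x\,y^{*}\ \text{(hyp)};\ \text{(iter)}'\,\} \\
=   & x\,y^{*} & \{\,(y^{*})^{*} = y^{*}\,\}
\end{array}
\]

□

The hypotheses of 12 can be weakened in several ways. The first two variations
are based on the following lemma: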



\[ y\,x \le x\,(x + y)^{*} \implies (x + y) \circ z \le x \star y \circ z \qquad (13) \]

\[
\begin{array}{lll}
    & (x + y) \circ z & \\
=   & y \circ z + y^{*}x\,(x + y) \circ z & \{\,(5)\,\} \\
\le & y \circ z + x\,((x + y)^{*})^{*}\,(x + y) \circ z & \{\,y\,x \le x\,(x + y)^{*}\ \text{(hyp)};\ \text{(iter)}'\,\} \\
=   & y \circ z + x\,(x + y) \circ z & \{\,(u^{*})^{*} = u^{*};\ u^{*}\,u \circ z = u \circ z\ (2)\,\}
\end{array}
\]

so $(x + y) \circ z \le x \star y \circ z$ by ($\star$ ind).

□

Two separation theorems follow immediately from 13. For $\star$ separation, we
can strengthen the conclusion to an equality, and we can obtain a ${}^{*}$
separation theorem by eliminating infinite executions of x:

\[ y\,x \le x\,(x + y)^{*} \implies (x + y) \star z = x \star y \star z \qquad (14) \]

\[ x^{\omega} = 0 \;\land\; y\,x \le x\,(x + y)^{*} \implies (x + y)^{*} = x^{*}y^{*} \qquad (15) \]

The hypothesis $x^{\omega} = 0$ of 15 cannot be eliminated. For example, if
$x = \{(0,1), (1,2), (2,2)\}$ and $y = \{(1,0), (2,0)\}$, then
$y\,x = \{(1,1), (2,1)\} \le x\,y\,x$, and $(2,1) \in y\,x$, but
$(2,1) \notin x^{*}y^{*}$.
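
The counterexample above can be replayed mechanically. The following sketch, with hypothetical helper names and relations encoded as Python sets of pairs, recomputes each relation mentioned in the example over the state space {0, 1, 2}.

```python
STATES = (0, 1, 2)


def compose(x, y):
    """Relational composition: first x, then y."""
    return {(a, c) for (a, b) in x for (b2, c) in y if b == b2}


def star(x):
    """Reflexive-transitive closure over STATES."""
    result = {(s, s) for s in STATES}
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger


def has_infinite_run(x):
    """True iff x^omega is nonzero, i.e. some state starts an infinite x-chain."""
    alive = set(STATES)
    while True:
        keep = {a for (a, b) in x if a in alive and b in alive}
        if keep == alive:
            return bool(alive)
        alive = keep


x = {(0, 1), (1, 2), (2, 2)}
y = {(1, 0), (2, 0)}

yx = compose(y, x)                                  # {(1, 1), (2, 1)}
commutes = yx <= compose(x, star(x | y))            # hypothesis y x <= x (x + y)*
mixed = star(x | y)                                 # (x + y)*
separated = compose(star(x), star(y))               # x* y*
```

The commutation hypothesis holds (`commutes` is true) and `has_infinite_run(x)` is true because of the self-loop at state 2, so only the hypothesis $x^{\omega} = 0$ fails; accordingly `(2, 1)` lies in `mixed` but not in `separated`.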

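In the special case of ${}^{*}$ separation, 12 can be strengthened as follows:

\[ y\,x \le (x + 1)\,y^{*} \implies (x + y)^{*} = x^{*}y^{*} \qquad (16) \]

\[
\begin{array}{lll}
    & (x + y)^{*} & \\
\le & ((x + 1) + y)^{*} & \{\,u^{*} = (u + 1)^{*}\,\} \\
=   & (x + 1)^{*}y^{*} & \{\,y\,(x + 1) \le (x + 1)\,y^{*}\ \text{(below)};\ \text{separation (12)}\,\} \\
=   & x^{*}y^{*} & \{\,(x + 1)^{*} = x^{*}\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & y\,(x + 1) & \\
=   & y\,x + y & \{\,\text{(dist)}\,\} \\
\le & (x + 1)\,y^{*} & \{\,y\,x \le (x + 1)\,y^{*}\ \text{(hyp)};\ y \le (x + 1)\,y^{*}\,\}
\end{array}
\]

□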
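
On a two-element state space there are only 16 binary relations, so separation theorems 12 and 16 can be checked exhaustively in the relational model. The sketch below is an illustration rather than a proof (it covers one tiny model only), and every name in it is a hypothetical helper introduced for the example.

```python
from itertools import product

STATES = (0, 1)
PAIRS = [(a, b) for a in STATES for b in STATES]
# All 16 binary relations over the two-element state space.
RELATIONS = [frozenset(p for p, keep in zip(PAIRS, bits) if keep)
             for bits in product((False, True), repeat=len(PAIRS))]
IDENTITY = frozenset((s, s) for s in STATES)


def compose(x, y):
    """Relational composition: first x, then y."""
    return frozenset((a, c) for (a, b) in x for (b2, c) in y if b == b2)


def star(x):
    """Reflexive-transitive closure over STATES."""
    result = IDENTITY
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger


def omega(x):
    """Relate s to every state iff an infinite x-chain starts at s."""
    alive = set(STATES)
    while True:
        keep = {a for (a, b) in x if a in alive and b in alive}
        if keep == alive:
            return frozenset((s, t) for s in alive for t in STATES)
        alive = keep


def iterate(x, y, kind):
    """x o y: kind '*' gives x* y, kind 'star' gives x^omega + x* y."""
    finite = compose(star(x), y)
    return finite if kind == "*" else omega(x) | finite


def check_theorem_12():
    """y x <= x y*  implies  (x + y) o z = x o y o z, for both iterators."""
    for x, y, z in product(RELATIONS, repeat=3):
        if compose(y, x) <= compose(x, star(y)):
            for kind in ("*", "star"):
                if iterate(x | y, z, kind) != iterate(x, iterate(y, z, kind), kind):
                    return False
    return True


def check_theorem_16():
    """y x <= (x + 1) y*  implies  (x + y)* = x* y*."""
    for x, y in product(RELATIONS, repeat=2):
        if compose(y, x) <= compose(x | IDENTITY, star(y)):
            if star(x | y) != compose(star(x), star(y)):
                return False
    return True
```

Both checks are expected to succeed, since the relational model satisfies the axioms from which the theorems are derived.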

The hypothesis of 16 cannot be weakened to $y\,x \le x^{*}y^{*}$. For example,
if $y = \{(0,1), (1,2)\}$ and $x = \{(1,2), (2,3)\}$, then
$y\,x = \{(0,2), (1,3)\} \le x^{*}y^{*}$, and $(0,3) \in (x + y)^{*}$ but
$(0,3) \notin x^{*}y^{*}$. Note also that the hypothesis of 16 does not work
for $\star$ iteration; for example, in a two-state relational model, if
$x = \{(0,1)\}$ and $y = \{(1,0)\}$, then
$(x + y) \star 0 = \{(0,0), (0,1), (1,0), (1,1)\}$ but $x \star y \star 0 = \{\}$.
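
Both counterexamples in the preceding paragraph can be recomputed directly. This is a minimal sketch with hypothetical helper names; the state space is passed explicitly because the two examples use four and two states respectively.

```python
def compose(x, y):
    """Relational composition: first x, then y."""
    return {(a, c) for (a, b) in x for (b2, c) in y if b == b2}


def star(x, states):
    """Reflexive-transitive closure over the given states."""
    result = {(s, s) for s in states}
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger


def omega(x, states):
    """Relate s to every state iff an infinite x-chain starts at s."""
    alive = set(states)
    while True:
        keep = {a for (a, b) in x if a in alive and b in alive}
        if keep == alive:
            return {(s, t) for s in alive for t in states}
        alive = keep


def star_op(x, y, states):
    """x^omega + x* y, the binary iteration operator."""
    return omega(x, states) | compose(star(x, states), y)


# First example: the weakened hypothesis y x <= x* y* is not enough.
four = range(4)
y1 = {(0, 1), (1, 2)}
x1 = {(1, 2), (2, 3)}
y1x1 = compose(y1, x1)                                   # {(0, 2), (1, 3)}
separated1 = compose(star(x1, four), star(y1, four))     # x* y*
mixed1 = star(x1 | y1, four)                             # (x + y)*

# Second example: the hypothesis of 16 does not give separation for omega-iteration.
two = range(2)
x2 = {(0, 1)}
y2 = {(1, 0)}
identity2 = {(s, s) for s in two}
hypothesis2 = compose(y2, x2) <= compose(x2 | identity2, star(y2, two))
mixed2 = star_op(x2 | y2, set(), two)                    # (x + y) star 0
separated2 = star_op(x2, star_op(y2, set(), two), two)   # x star y star 0
```

In the first example `y1x1` is contained in `separated1`, yet `(0, 3)` is in `mixed1` and not in `separated1`. In the second, `hypothesis2` is true, `mixed2` is the full relation on two states, and `separated2` is empty.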

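Combining 14 and the dual of 16 gives another theorem that works for both
iterators:

\[ y\,x \le x\,x^{*}(1 + y) \implies (x + y) \circ z = x \circ y \circ z \qquad (17) \]

\[
\begin{array}{lll}
    & \text{true} & \\
\Rightarrow & y\,x \le x\,x^{*}(1 + y) & \{\,y\,x \le x\,x^{*}(1 + y)\ \text{(hyp)}\,\} \\
\Rightarrow & y\,x \le x^{*}(1 + y) & \{\,x\,x^{*} \le x^{*}\,\} \\
\Rightarrow & (x + y)^{*}z = x^{*}y^{*}z & \{\,(16)'\ \text{with}\ x, y := y, x\,\} \\
\Rightarrow & (x + y) \circ z = x \circ y \circ z & \{\,y\,x \le x\,(x + y)^{*}\ \text{(hyp)};\ (14)\,\}
\end{array}
\]

□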

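Finally, more complex separation theorems are possible. For example:

\[ y\,x \le (x + z)\,y^{*} \;\land\; z\,x = 0 \implies (x + y + z) \circ u = x \circ (y + z) \circ u \qquad (18) \]

\[
\begin{array}{lll}
    & (x + y + z) \circ u & \\
\le & (x + y^{*}z + y) \circ u & \{\,z \le y^{*}z\,\} \\
=   & (x + y^{*}z) \circ y \circ u & \{\,y\,(x + y^{*}z) \le (x + y^{*}z)\,y^{*}\ \text{(below)};\ (12)\,\} \\
=   & x \circ (y^{*}z) \circ y \circ u & \{\,z\,x = 0\ \text{(hyp), so}\ (y^{*}z)\,x = 0;\ (12)\,\} \\
=   & x \circ (y + z) \circ u & \{\,(y^{*}z) \circ y \circ u = (y + z) \circ u\ (4)\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & y\,(x + y^{*}z) & \\
=   & y\,x + y\,y^{*}z & \{\,\text{(dist)}\,\} \\
\le & (x + z)\,y^{*} + y\,y^{*}z & \{\,y\,x \le (x + z)\,y^{*}\ \text{(hyp)}\,\} \\
\le & (x + y^{*}z)\,y^{*} & \{\,z \le y^{*}z;\ y\,y^{*}z \le y^{*}z\,y^{*};\ \text{(dist)}\,\}
\end{array}
\]

□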



5 Reduction 

The separation theorems of the last section can be viewed as special cases of
theorems of the form

\[ \ldots \implies (x + y) \circ z = (p\,x + q\,y) \circ (\bar{p}\,x + \bar{q}\,y) \circ z \]

where p = 1 and q = 0. In this section, we prove several such theorems, where
the separands are obtained by prefixing or postfixing summands of the original
iterand with predicates.

One way to interpret a theorem of the form above is as follows; other
interpretations are given later in the section. Let x and y be programs with
critical sections, let p mean that y is in a critical section and let q mean
that x is in a critical section. The conclusion says that every concurrent
execution of x and y is equivalent to one that executes in two phases: in the
first phase, x executes only when y is in a critical section (and vice versa),
and in the second phase, x executes only when y is not in a critical section
(and vice versa). If the execution starts in a $\bar{p}\,\bar{q}$ state (i.e.,
one where neither program is in its critical section), the first phase vanishes
(thanks to loop elimination) and we have a corollary of the form

\[ \ldots \implies \bar{p}\,\bar{q}\,(x + y) \circ z = \bar{p}\,\bar{q}\,(\bar{p}\,x + \bar{q}\,y) \circ z \]

which says that we can pretend that critical sections are not interrupted. This
amounts to pretending that critical sections are executed atomically.

5.1 Fair Progress Reduction 

Consider a finite class of workers $x_i$, each of whom, at any time, is
classified as “fast” ($p_i$) or “slow” ($\bar{p}_i$). The classification system
can depend on the particular worker, so one worker might be classified as slow
if anybody is waiting for him, while another might be classified as slow unless
he’s waiting for somebody else. The only restriction we place on the
classification is that if it is possible to execute a fast worker $x_i$
followed by a slow worker $x_j$, then (1) $x_j$ must already be classified as
slow, and (2) the same effect can be obtained by executing $x_j$ and then $x_i$.

The following theorem says that, under these conditions, any execution of
the system is equivalent to one that executes in two phases, where only slow
workers execute in the first phase and only fast workers execute in the second
phase:

\[ (\forall i, j : p_i\,x_i\,\bar{p}_j\,x_j \le \bar{p}_j\,x_j\,x_i) \implies (+i : x_i) \circ y = (+i : \bar{p}_i\,x_i) \circ (+i : p_i\,x_i) \circ y \qquad (19) \]

Let $u = (+i : \bar{p}_i\,x_i)$, and let $v = (+i : p_i\,x_i)$; then
$(+i : x_i) = u + v$, and

\[
\begin{array}{lll}
    & v\,u & \\
=   & (+i : p_i\,x_i\,(+j : \bar{p}_j\,x_j)) & \{\,\text{(dist)}\,\} \\
=   & (+i, j : p_i\,x_i\,\bar{p}_j\,x_j) & \{\,\text{(dist)}\,\} \\
\le & (+i, j : \bar{p}_j\,x_j\,x_i) & \{\,\text{(hyp)}\,\} \\
=   & (+j : \bar{p}_j\,x_j)\,(+i : x_i) & \{\,\text{(dist)}\,\} \\
=   & u\,(u + v) & \{\,\text{def}\ u, v\,\} \\
\le & u\,u^{*}(1 + v) & \{\ \}
\end{array}
\]

so $(u + v) \circ y = u \circ v \circ y$ by 17.

□

For example, let the workers be connected by unbounded FIFO queues (at
least one between every pair of distinct workers); at each step, a worker can
receive any number of messages (but at most one from any queue) and send
any number of messages. (As in all of our examples using queues, we assume
that the only operations on queues are blocking send and receive operations.)
Let $p_i$ mean that every outgoing queue of $x_i$ is longer than every incoming
queue of $x_i$. The hypotheses of the theorem are then satisfied, because (1) at
most one worker is fast at any time, and (2) if a worker is fast, all of his
outgoing queues are nonempty, so $p_i\,x_i\,\bar{p}_j\,x_j \le x_j\,x_i$ (since
the data manipulated by the $x_j$ action is not modified by the $x_i$ action).
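
To get a feel for this classification, consider the simplest instance: two workers A and B joined by one queue in each direction, both initially empty, where each step sends or receives exactly one message. The sketch below is an illustrative assumption-laden model, not taken from the paper: it tracks only the two queue lengths, treats a worker as fast when its outgoing queue is longer than its incoming one, and explores every run in which only slow workers move (the first phase of the reduced execution). The function names and the depth bound are hypothetical.

```python
from collections import deque


def slow_moves(ab, ba):
    """Successor states when only a worker classified as slow may step.

    ab and ba are the lengths of the queues A->B and B->A.
    """
    moves = []
    if not ab > ba:                      # A is slow
        moves.append((ab + 1, ba))       # A sends to B
        if ba > 0:
            moves.append((ab, ba - 1))   # A receives from B (blocking receive)
    if not ba > ab:                      # B is slow
        moves.append((ab, ba + 1))       # B sends to A
        if ab > 0:
            moves.append((ab - 1, ba))   # B receives from A (blocking receive)
    return moves


def max_length_gap(max_steps=12):
    """Largest difference between the two queue lengths over all slow-only runs."""
    seen = {(0, 0)}
    frontier = deque([((0, 0), 0)])
    worst = 0
    while frontier:
        (ab, ba), depth = frontier.popleft()
        worst = max(worst, abs(ab - ba))
        if depth == max_steps:
            continue
        for nxt in slow_moves(ab, ba):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return worst
```

Within the explored depth the two queue lengths never differ by more than one, because a slow worker's move can only push the difference one step toward or past zero.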

An important special case is where there are exactly two workers, with a 
single FIFO queue (initially empty) in each direction, and at each step a worker 
either sends a single message or receives a single message. Then, in the first phase, 
the two queue lengths never differ by more than one, and so the phase can be 
parsed as a sequence of “macro steps”, each with one step for each worker. For 
many protocols, only a finite number of states are reachable in the first phase; 
since only one worker gets to execute in the second phase, a number of interesting 
properties (maximum queue length, freedom from deadlock, etc.) are decidable 
for such protocols; see [4].

5.2 One-Way Reduction

A special case of fair progress reduction occurs when one of the workers (y
below) is always slow:

\[ p\,x\,\bar{p} = 0 \;\land\; p\,x\,y \le y\,x \implies (x + y) \circ z = (\bar{p}\,x + y) \circ (p\,x) \circ z \qquad (20) \]

(19), where i ranges over $\{0, 1\}$, $x_0 := x$, $x_1 := y$, $p_0 := p$,
$p_1 := 0$.

□

For example, let y be (the transition relation of) a program with critical 
sections, where p means that control is inside a critical section. The hypotheses 
say that x cannot move control out of the critical section, and that, from inside 
the critical section, x actions commute to the right past y actions. The conclusion 
says that any execution is equivalent to one that executes in two phases: during 
the first phase, x does not execute when control is in a critical section of y (i.e., 
critical sections execute atomically), and during the second phase, x executes 
exclusively, with control remaining in the critical section. 

One-way reduction can be applied in many pipelining situations. For example,
let x and y be programs that communicate via FIFO queues, and let p hold when
every queue from x to y is nonempty. The hypotheses of 20 say that (1) x cannot
empty a queue from x to y (i.e., messages cannot be unsent), and (2) if every
queue from x to y is nonempty, then the result of performing an x step followed
by a y step can be obtained by performing a y step first^3. The conclusion says
that we can pretend that execution proceeds in two phases; in the first phase, x
executes only when one of his queues to y is empty, and in the second phase,
only x gets to execute. In the special case where there is only one queue from x
to y, the first phase condition says, in effect, that messages from x are received
as soon as they are sent.
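
One-way reduction (20) can also be checked exhaustively on a two-element state space, where predicates are the four sub-identity relations. The sketch below is only an illustration in one tiny relational model, reading the hypotheses as "p x followed by not-p is empty" and "p x y is contained in y x"; all helper names are hypothetical.

```python
from itertools import product

STATES = (0, 1)
PAIRS = [(a, b) for a in STATES for b in STATES]
RELATIONS = [frozenset(p for p, keep in zip(PAIRS, bits) if keep)
             for bits in product((False, True), repeat=len(PAIRS))]
IDENTITY = frozenset((s, s) for s in STATES)
# Predicates are identified with sub-identity relations.
PREDICATES = [frozenset((s, s) for s in subset)
              for subset in ((), (0,), (1,), (0, 1))]


def compose(x, y):
    """Relational composition: first x, then y."""
    return frozenset((a, c) for (a, b) in x for (b2, c) in y if b == b2)


def star(x):
    """Reflexive-transitive closure over STATES."""
    result = IDENTITY
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger


def omega(x):
    """Relate s to every state iff an infinite x-chain starts at s."""
    alive = set(STATES)
    while True:
        keep = {a for (a, b) in x if a in alive and b in alive}
        if keep == alive:
            return frozenset((s, t) for s in alive for t in STATES)
        alive = keep


def iterate(x, y, kind):
    """x o y: kind '*' gives x* y, kind 'star' gives x^omega + x* y."""
    finite = compose(star(x), y)
    return finite if kind == "*" else omega(x) | finite


def check_one_way_reduction():
    for p in PREDICATES:
        not_p = IDENTITY - p
        for x, y in product(RELATIONS, repeat=2):
            px = compose(p, x)
            stays_inside = not compose(px, not_p)           # p x not-p = 0
            commutes = compose(px, y) <= compose(y, x)      # p x y <= y x
            if not (stays_inside and commutes):
                continue
            for z in RELATIONS:
                for kind in ("*", "star"):
                    lhs = iterate(x | y, z, kind)
                    rhs = iterate(compose(not_p, x) | y, iterate(px, z, kind), kind)
                    if lhs != rhs:
                        return False
    return True
```

The check is expected to return true, since (20) is derived from axioms that the relational model satisfies; it is a sanity check on the reading of the formula, not a substitute for the algebraic proof.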

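One shortcoming of fair progress reduction is that the $\star$ part of the
theorem can’t be dualized. On the other hand, we can prove the dual of 20 as
follows:

\[ \bar{p}\,x\,p = 0 \;\land\; y\,x\,p \le x\,y \implies (x + y) \circ z = (x\,p) \circ (y + x\,\bar{p}) \circ z \qquad (21) \]

(18) with $x := x\,p$, $y := y + x\,\bar{p}$, $z := x\,\bar{p}$, $u := z$,
justified as follows:

\[
\begin{array}{lll}
    & (y + x\,\bar{p})\,(x\,p) & \\
=   & y\,x\,p + x\,\bar{p}\,x\,p & \{\,\text{(dist)}\,\} \\
=   & y\,x\,p & \{\,\bar{p}\,x\,p = 0\ \text{(hyp)}\,\} \\
\le & x\,y & \{\,y\,x\,p \le x\,y\ \text{(hyp)}\,\} \\
\le & (x\,p + x\,\bar{p})\,(y + x\,\bar{p}) & \{\,x = x\,p + x\,\bar{p};\ y \le y + x\,\bar{p}\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & (x\,\bar{p})\,(x\,p) & \\
=   & 0 & \{\,\bar{p}\,x\,p = 0\ \text{(hyp)}\,\}
\end{array}
\]

□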



^3 This holds only if queues from y to x are unbounded, and y receives at most
one message per queue per step.

In contexts where control starts and ends outside of the critical section, we
can eliminate the extra x iterations with loop elimination:

\[ \bar{p}\,x\,p = 0 \;\land\; y\,x\,p \le x\,y \implies \bar{p}\,(x + y) \circ z = \bar{p}\,(y + x\,\bar{p}) \circ z \qquad (22) \]

\[ p\,x\,\bar{p} = 0 \;\land\; p\,x\,y \le y\,x \implies (x + y) \circ \bar{p} = (\bar{p}\,x + y) \circ (\bar{p} + (p\,x) \circ 0) \qquad (23) \]



5.3 Two-Phased Reduction 

Two-phased reductions combine 20 and 21. For this section and the next, we 
assume the following hypotheses: 



\[
\begin{array}{l}
q\,y\,p = 0 \\
\bar{p}\,x\,p = 0 \\
q\,x\,\bar{q} = 0 \\
y\,x\,p \le x\,y \\
q\,x\,y \le y\,x
\end{array}
\]

These hypotheses can be read as follows. Think of the critical sections of y as 
operating in two phases, where p (resp. q) means control is in the first (resp. 
second) phase. The first hypothesis says that control cannot pass directly from 
the second phase to the first phase without first leaving the critical section 
(this defines what we mean by a “two-phased” critical section). The next two 
hypotheses say that x cannot move control into the first phase or out of the 
second. The last two hypotheses say that x actions left-commute out of the 
first phase and right-commute out of the second phase. (Note that there are no 
constraints on an action that moves control from the first phase to the second 
phase.) The conclusion says that we can pretend that execution consists of a 
sequence of first phase x actions, followed by a phase where the critical section 
executes atomically, followed by a sequence of second phase x actions: 

\[ (x + y) \circ z = (x\,p) \circ (y + \bar{q}\,x\,\bar{p}) \circ (q\,x) \circ z \qquad (24) \]

\[
\begin{array}{lll}
    & (x + y) \circ z & \\
\le & (x\,p) \circ (y + x\,\bar{p}) \circ z & \{\,(21)\,\} \\
\le & (x\,p) \circ (y + \bar{q}\,x\,\bar{p}) \circ (q\,x) \circ z & \{\,q\,(x\,\bar{p})\,y \le y\,(x\,\bar{p})\ \text{(below)};\ (20)\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & q\,(x\,\bar{p})\,y & \\
\le & q\,x\,q\,y & \{\,q\,x\,\bar{q} = 0\ \text{(hyp)}\,\} \\
\le & q\,x\,y\,\bar{p} & \{\,q\,y\,p = 0\ \text{(hyp)}\,\} \\
\le & y\,x\,\bar{p} & \{\,q\,x\,y \le y\,x\ \text{(hyp)}\,\}
\end{array}
\]

Again, by assuming control begins/ends outside of the critical section, we
can eliminate the extra x iterations; for example,

\[ \bar{p}\,(x + y) \circ \bar{q} = \bar{p}\,(\bar{q}\,x\,\bar{p} + y) \circ (\bar{q} + (q\,x) \circ 0) \qquad (25) \]

Two-phased reduction is usually applied to pretend the atomicity of an
operation that gathers information/resources, possibly executes some global
atomic action, and then releases information/resources. Here are some examples
of such operations (ignoring the subtleties of each case):

— a fragment of a sequential program that contains at most one access to a
shared variable (the so-called single-assignment rule);

— a database transaction that never obtains a lock after releasing a lock [9];

— a sequence of semaphore p operations followed by a sequence of semaphore v 
operations [8]; 

— a fragment of a distributed program that receives messages, performs a local 
action, and then sends messages [6]. 



5.4 Lamport and Schneider’s Reduction Theorem 

In traditional program reasoning, the concern is program properties rather than 
program refinement; to apply reduction in this context, we need a way to turn 
properties of the “reduced” program (where critical sections execute atomically) 
into properties of the unreduced program. One way to do this is to restrict 
attention to states where control is outside the critical sections. For example, 
under the hypotheses of two-phased reduction, if Q always holds in the reduced 
program, then p + q + Q always holds in the unreduced program (i.e., Q holds 
whenever control is outside the critical section) . The early reduction theorems of 
Lipton [8] and Doeppner [3] had this structure. However, sometimes we want to
show that Q holds in all states, even when control is inside the critical section. 

A theorem of this kind was proposed by Lamport and Schneider [7]. In order 
to generalize their theorem to handle total correctness (with slightly stronger 
hypotheses), we write some of the hypotheses using $\circ$ iteration, with the
understanding that the same interpretation of $\circ$ (${}^{*}$ or $\star$) is to
be used for the hypotheses and theorem 26 below. Their hypotheses, suitably
generalized, are as follows (in addition to the other hypotheses of section 5.3):

\[
\begin{array}{l}
I = I\,\bar{p} \\
I\,(y + \bar{q}\,x\,\bar{p}) \circ \bar{p}\,\bar{q}\,\bar{Q} = 0 \\
\bar{p}\,Q\,(y\,p)^{*}\,\bar{Q} = 0 \\
\bar{Q}\,(q\,y)^{*}\,\bar{q}\,Q = 0 \\
\bar{Q}\,1^{\omega} \le \bar{Q}\,(q\,y)^{*}\,\bar{q}\,1^{\omega} \\
(q\,x) \circ 0 = 0
\end{array}
\]

These hypotheses can be read as follows:

— the initialization code I leaves control outside of the first phase;

— Q always holds in the reduced program whenever control is outside of the
critical section; for $\circ = \star$, we also assume that the reduced program
terminates;

— it is impossible to start outside the first phase and run first-phase actions
so as to falsify Q;

— it is impossible to run second-phase actions, ending outside the second phase,
so as to truthify Q;

— from any state where control is in the second phase and Q is false, it is
possible to execute second-phase actions so as to bring control outside of the
second phase^4; and

— for $\circ = \star$, x cannot execute forever with control in the second phase.
(If $\circ = {}^{*}$, the formula is trivially satisfied.)

The following theorem says that Q always holds in the unreduced program (and
if $\circ = \star$, the program terminates):

\[ I\,(y + x) \circ \bar{Q} = 0 \qquad (26) \]

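Define $L = (y + \bar{q}\,x\,\bar{p})$; then

\[
\begin{array}{lll}
    & I\,(y + x) \circ \bar{Q} & \\
\le & I\,(y + x) \circ \bar{Q}\,1^{\omega} & \{\,1 \le 1^{\omega}\,\} \\
\le & I\,(y + x) \circ \bar{Q}\,(q\,y)^{*}\,\bar{q}\,1^{\omega} & \{\,\bar{Q}\,1^{\omega} \le \bar{Q}\,(q\,y)^{*}\,\bar{q}\,1^{\omega}\ \text{(hyp)}\,\} \\
\le & I\,(y + x) \circ (q\,y)^{*}\,\bar{q}\,\bar{Q}\,1^{\omega} & \{\,\bar{Q}\,(q\,y)^{*}\,\bar{q}\,Q = 0\ \text{(hyp)};\ \bar{Q} \le 1\,\} \\
\le & I\,(y + x) \circ \bar{q}\,\bar{Q}\,1^{\omega} & \{\,q\,y \le y + x;\ u \circ u^{*}v = u \circ v\,\} \\
=   & I\,\bar{p}\,L \circ (\bar{q}\,\bar{Q}\,1^{\omega} + (q\,x) \circ 0) & \{\,I = I\,\bar{p}\ \text{(hyp)};\ (25)\,\} \\
=   & I\,\bar{p}\,L \circ \bar{q}\,\bar{Q}\,1^{\omega} & \{\,(q\,x) \circ 0 = 0\ \text{(hyp)}\,\} \\
\le & I\,L \circ \bar{p}\,(L\,p) \circ \bar{q}\,\bar{Q}\,1^{\omega} & \{\,\bar{p}\,u \circ v \le u \circ \bar{p}\,(u\,p) \circ v\ \text{(below)}\,\} \\
\le & I\,L \circ \bar{p}\,(y\,p) \circ \bar{q}\,\bar{Q}\,1^{\omega} & \{\,L\,p = y\,p\,\} \\
\le & I\,L \circ \bar{p}\,(y\,p)^{*}\,\bar{q}\,\bar{Q}\,1^{\omega} & \{\,I\,L \circ (y\,p) \circ 0 \le I\,L \circ 0 = 0\ \text{(hyp)}\,\} \\
\le & I\,L \circ \bar{p}\,\bar{Q}\,(y\,p)^{*}\,\bar{q}\,\bar{Q}\,1^{\omega} & \{\,\bar{p}\,Q\,(y\,p)^{*}\,\bar{Q} = 0\ \text{(hyp)}\,\} \\
\le & I\,L \circ \bar{p}\,\bar{Q}\,\bar{q}\,y^{*}\,\bar{Q}\,1^{\omega} & \{\,(y\,p)\,\bar{q} \le y\,p \le \bar{q}\,y\ \text{(hyp)};\ \text{(iter)}'\,\} \\
\le & 0 & \{\,I\,L \circ \bar{p}\,\bar{q}\,\bar{Q} = 0\ \text{(hyp)}\,\}
\end{array}
\]

\[
\begin{array}{lll}
    & \bar{p}\,u \circ v & \\
\le & \bar{p}\,(u\,u^{*}\bar{p} + u\,p) \circ v & \{\,u \le u\,u^{*}\bar{p} + u\,p\,\} \\
\le & \bar{p}\,(u\,u^{*}\bar{p}) \circ (u\,p) \circ v & \{\,(u\,p)\,(u\,u^{*}\bar{p}) \le (u\,u^{*}\bar{p})\,1;\ (12)\,\} \\
\le & (u\,u^{*}) \circ \bar{p}\,(u\,p) \circ v & \{\,\bar{p}\,(u\,u^{*}\bar{p}) \le (u\,u^{*})\,\bar{p};\ \text{(iter)}\,\} \\
=   & u \circ \bar{p}\,(u\,p) \circ v & \{\,(u\,u^{*}) \circ w = u \circ w\,\}
\end{array}
\]

□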

^4 There are two ways to talk about possibility (i.e., relational totality) in
omega algebras. The one used here is that x is total on a domain p iff
$p\,x\,1^{\omega} = p\,1^{\omega}$ (i.e., all effects of x can be later
obliterated). This works fine for the relational model, and fits in well with
proof checking mechanisms. An alternative, which works for other programming
models, is that x is total on p iff, for every y,
$y\,p\,x = 0 \implies y\,p = 0$.

5.5 Back’s Theorem

Back [1, 2] proposed a reduction theorem for total correctness. Instead of
breaking critical sections of y into phases, his theorem partitions the
environment action x into two components, r and l; inside a critical section,
r (resp. l) actions can always be pushed to the right (resp. left). Suitably
generalized, his theorem uses the following hypotheses:



\[
\begin{array}{l}
r\,p\,y \le y\,r \\
y\,p\,l \le l\,y \\
r\,p\,l \le l\,r \\
p\,r\,\bar{p} = 0 \\
\bar{p}\,l\,p = 0
\end{array}
\]



The first three conditions say that, inside of the critical section, r actions
commute to the right of l and y actions, and l actions commute to the left of r
and y actions. The last two conditions say that r (resp. l) cannot move control
out of (resp. into) the critical section. The theorem says that we can pretend
(in the main phase) that critical sections execute atomically:

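\[ (y + r + l) \circ z = (p\,l) \circ (y + \bar{p}\,l + r\,\bar{p}) \circ (r\,p) \circ z \qquad (27) \]

\[
\begin{array}{lll}
    & (y + r + l) \circ z & \\
=   & (p\,l + y + r + \bar{p}\,l) \circ z & \{\,l = p\,l + \bar{p}\,l\,\} \\
=   & (p\,l) \circ (y + r + \bar{p}\,l) \circ z & \{\,18\ \text{with}\ x := p\,l,\ y := y + r,\ z := \bar{p}\,l,\ \text{proofs (a), (b) below}\,\} \\
=   & (p\,l) \circ (y + r\,\bar{p} + \bar{p}\,l + r\,p) \circ z & \{\,r = r\,p + r\,\bar{p}\,\} \\
=   & (p\,l) \circ (y + r\,\bar{p} + \bar{p}\,l) \circ (r\,p) \circ z & \{\,\text{proof (c) below};\ (17)\,\}
\end{array}
\]

(a):

\[
\begin{array}{lll}
    & (y + r + \bar{p}\,l)\,(p\,l) & \\
=   & y\,p\,l + r\,p\,l + \bar{p}\,l\,p\,l & \{\,\text{(dist)}\,\} \\
=   & y\,p\,l + r\,p\,l & \{\,\bar{p}\,l\,p = 0\ \text{(hyp)}\,\} \\
\le & l\,y + l\,r & \{\,y\,p\,l \le l\,y,\ r\,p\,l \le l\,r\ \text{(hyp)}\,\} \\
=   & (p\,l + \bar{p}\,l)\,(y + r) & \{\,l = p\,l + \bar{p}\,l;\ \text{(dist)}\,\}
\end{array}
\]

(b):

\[
\begin{array}{lll}
    & (\bar{p}\,l)\,(p\,l) & \\
=   & 0 & \{\,\bar{p}\,l\,p = 0\ \text{(hyp)}\,\}
\end{array}
\]

(c):

\[
\begin{array}{lll}
    & (r\,p)\,(y + r\,\bar{p} + \bar{p}\,l) & \\
=   & r\,p\,y + r\,p\,r\,\bar{p} + r\,p\,\bar{p}\,l & \{\,\text{(dist)}\,\} \\
=   & r\,p\,y & \{\,p\,r\,\bar{p} = 0\ \text{(hyp)};\ p\,\bar{p} = 0\,\} \\
\le & y\,r & \{\,r\,p\,y \le y\,r\ \text{(hyp)}\,\} \\
\le & (y + r\,\bar{p} + \bar{p}\,l)\,((r\,p) + (y + r\,\bar{p} + \bar{p}\,l)) & \{\,r = r\,p + r\,\bar{p}\,\}
\end{array}
\]

□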

Again, assuming that $r\,p$ actions terminate (i.e., r actions cannot leave
control inside the critical section forever), we obtain a simpler reduction
theorem by loop elimination:

\[ (r\,p) \circ 0 = 0 \implies \bar{p}\,(y + r + l) \circ \bar{p} = \bar{p}\,(y + \bar{p}\,l + r\,\bar{p}) \circ \bar{p} \qquad (28) \]

A typical application of Back’s theorem is the following asynchronous snapshot
algorithm. Let y be a program whose critical section actions record values
of some subset of the program variables, and let p mean that not all of these
variables are recorded. (The critical section is entered by discarding all
recorded values.) Let r (resp. l) be a sum of transitions that read and write
variables in an arbitrary way, where each of these transitions can execute only
if all (resp. none) of the variables it accesses are recorded. The hypotheses of
Back’s theorem then hold (r and l do not change the set of recorded variables,
and so don’t move control in or out of the critical section; each summand of r
commutes with each summand of l because they must access disjoint sets of
variables, etc.). Thus, we can pretend that the entire state is recorded
without interruption.

6 Conclusion 

In summary, separation theorems are powerful tools for reasoning about con- 
currency control. Our experience with reduction resulted in theorems that were 
uniformly simpler, more powerful, and easier to prove than the originals. These 
theorems also show the advantages of working with simple iteration constructs 
(as opposed to the usual do or while loops), and of retaining sequential
composition as a first-class operator.

Abstract. It is common for a real-time process to consist of a nonterminating
loop monitoring an input and controlling an output. Hence a
real-time program development method needs to support nonterminat- 
ing loops. Earlier work on real-time program development has produced 
a real-time refinement calculus that makes use of a novel deadline com- 
mand which allows timing constraints to be embedded in real-time pro- 
grams. The addition of the deadline command to the real-time program- 
ming language gives the significant advantage of providing a real-time 
programming language that is machine independent. This allows a more 
abstract approach to the program development process. 

In this paper we add possibly nonterminating loops to the refinement cal- 
culus. First we examine the semantics of possibly nonterminating loops, 
and use them to reason directly about a simple example. Then we develop 
simpler refinement rules that make use of a loop invariant. 



1 Introduction 

Our overall goal is to provide a method for the formal development of real-time 
programs. One problem with real-time programs is that the timing characteris- 
tics of a program are not known until it is compiled for a particular machine, 
whereas we would prefer a machine independent program development method. 
The approach we have taken is to extend a real-time programming language with 
a deadline command [2] that allows timing constraints to be incorporated into a 
real-time program. The result is a machine-independent real-time programming 
language. This allows a more abstract approach to the program development 
process. Of course the compiled version of the program needs to be checked to 
make sure that it meets all the deadlines specified in the extended language 
when the program is run on a particular machine. We consider such checking to 
be part of an extended compilation phase for the program, rather than part of 
the program development phase. Unfortunately, there is the possibility that the 
compiled code may not meet all the deadlines. In this case the program is not 
suitable and either we need to redevelop (parts of) the program, or alternatively 
find a faster machine or a compiler that generates better code. 

To date we have developed a sequential real-time refinement calculus [6,7] 
that can be viewed as an extension [5] of the standard sequential refinement
calculus [15]. The rules we have previously used for introducing loops only
consider terminating loops [7], and include a novel rule that uses a timing deadline
(rather than a loop variant) to show termination [8]. However, real-time control 
applications can require that a relationship is maintained between inputs and 
outputs over all time (not just between the initial value of variables and the final 
value of variables) . Hence we need to be able to develop loops that may not ter- 
minate. In this paper we examine the problem of incorporating nonterminating 
loops into our refinement calculus, and develop both a direct method using the 
loop semantics, as well as a method closer to the usual loop invariant approach. 

1.1 Related Work 

We concentrate our comparison with related work on approaches that develop 
real-time programs from abstract specifications. The two approaches we con- 
sider are Scholefield’s Temporal Agent Model (TAM) [16,17], and Hooman’s 
assertional specification and verification of real-time programs [12]. 

All the methods introduce some form of time variable that allows reference to 
the start and finish times of commands, and that can be used to specify timing 
constraints and relationships. The two main features that distinguish our work 
are the addition of the deadline command, and the use of timed traces for inputs, 
outputs and local variables. 

TAM provides a real-time refinement calculus. If we compare the sequential 
aspects of TAM with our approach, the main difference is in the treatment of 
deadlines. In TAM, deadlines are specified as a constraint on the execution time 
allowed for a command. This restricts deadlines to the command structure of 
the TAM language. In comparison, we express deadlines via a separate deadline 
command. This allows more flexibility in specifying timing constraints. In addi- 
tion to being able to specify constraints that match the structure of the language, 
one can also specify constraints on execution paths that cross the boundaries of 
language constructs, e.g., a path that begins before an alternation command 
and ends within one branch of the alternation, or a path from a point within the 
body of a loop back around to a point within the body of the loop on its next 
iteration. 

A consequence of the TAM approach to deadlines is that it is necessary to 
specify a constant representing the evaluation time for all guards of an alterna- 
tion, even though in practice the guard evaluation time is different for different 
branches of an alternation. In our approach there is no need for such a constant: 
guard evaluation is just considered to be part of the execution time of each path 
on which the guard appears. The real constraints are the overall constraint on 
each path. There is no necessity to have an additional, more restrictive, con- 
straint on the guard evaluation. 

Another difference is in the treatment of inputs and outputs. TAM provides 
shunts for communication between processes and for communication with the 
environment. Our approach treats inputs and outputs as traces over time. One 
of the main application areas we see for our work is in the specification and 
refinement of systems with continuous variables, such as the level of water in a
mine shaft [13]. The level of water in the mine shaft is a continuous real-valued
function of time (it is a timed trace). A program monitoring the water level will 
sample the timed trace, but in order to provide an abstract specification of the 
operation of the program it is necessary to refer to such timed traces. Within this 
paper we define a simple read command, that samples an input, but because we 
use timed traces, more complex input commands, such as an analog-to-digital 
conversion, can be handled within the same framework. 

Hooman’s work [12] on assertional specification and verification extends the 
use of traditional Hoare logic triples to real-time systems. The real-time interface 
of the program with the environment can be specified using primitives denoting 
the timing of observable events. The interpretation of triples has been adapted 
to require the postcondition to hold for terminating and non-terminating com- 
putations in order to handle non-terminating loops. 

Each of the atomic commands in Hooman’s language has an associated con- 
stant representing its execution time, and compound commands, such as if- 
then-else, have constants representing such things as the time to evaluate the 
guard. For example, Hooman [12, page 126] introduces a constant, Ta, repre- 
senting the execution time of an assignment command. An obvious problem is 
that not all assignments take the same amount of time, but further, given a 
single assignment command, its execution time may vary greatly (due to the 
effects of pipelines or caches) depending upon the preceding commands in the 
path. Timing constraints on program components must be broken down into 
timing constraints on the individual commands. The overall approach is similar 
to that used in TAM and suffers in the same ways in comparison to the use of 
a deadline command to specify timing constraints on paths. 

The seemingly small addition of the deadline command in our work has 
had a significant impact on the whole development method, and importantly, 
has allowed developments to treat real-time constraints in a more realistic and 
practical manner than in the other approaches. 

— During the refinement process, deadline commands can be used to separate
out timing constraints, leaving behind a requirement to be met that does 
not explicitly contain timing constraints. The standard refinement calculus 
can be used to develop such components. 

— The timing constraints are on execution paths through the program and 
are not necessarily constrained to the phrase structure of the programming 
language. This allows more realistic timing constraints to be devised. 

— A timing constraint is on a whole execution path rather than each command 
in the path, and hence is less restrictive in terms of allowable implementa- 
tions. 

Section 2 outlines the real-time refinement calculus and specifies a simple 
alarm detection example. Section 3 defines a possibly nonterminating loop con- 
struct and applies the semantics directly to the alarm example. Section 4 de- 
velops refinement laws for introducing possibly nonterminating loops; the laws 
make use of loop invariants to simplify the reasoning. 
2 Real-Time Refinement Calculus

We model time by nonnegative real numbers:

$$\mathit{Time} \mathrel{\widehat{=}} \{ r : \mathit{real} \mid 0 \le r < \infty \} ,$$

where real numbers include infinity and allow operations such as comparisons
with infinity. The real-time refinement calculus makes use of a special real-valued
variable, $\tau$, for the current time. To allow for nonterminating programs, we allow
$\tau$ to take on the value infinity ($\infty$).

$$\mathit{Time}_\infty \mathrel{\widehat{=}} \mathit{Time} \cup \{\infty\}$$

In real-time programs we distinguish three kinds of variables: 

— inputs, which are under the control of the environment of the program, 

— outputs, which are under the control of the program, and 

— local variables, which are under the control of the program, but unlike out- 
puts are not externally visible. 

All variables are modelled as total functions from Time to the declared type
of the variable. We use the term program variables to refer to those variables
under the control of the program, i.e., outputs and local variables. Note that it
is not meaningful to talk about the value of a variable at time infinity. Only the
(special) current time variable, $\tau$, may take on the value infinity. We sometimes
need to refer to the set of all variables in scope. We call this $\rho$, and the subset
containing just the program variables (outputs and local variables, but not
inputs) we call $\hat{\rho}$.

The semantics of the specification command follows an approach similar to 
that of Utting and Fidge [18]. In this paper we represent the semantics of a 
command by a predicate in a form similar to that of Hehner [10,9]. The predicate
relates the start time of a command, $\tau_0$, to its finish time, $\tau$ (which may
be infinity), and constrains the traces of the variables over time [18]. All our
commands insist that time does not go backwards: $\tau_0 \le \tau$.

The meaning function $\mathcal{M}$ takes the variables in scope, $\rho$, and a command $C$
and returns the corresponding predicate, $\mathcal{M}_\rho(C)$. As for Hehner, refinement of
commands is defined as reverse implication:

$$C \sqsubseteq D \mathrel{\widehat{=}} \mathcal{M}_\rho(C) \Leftarrow \mathcal{M}_\rho(D) ,$$

where the reverse implication holds for all possible values of the variables,
including $\tau_0$ and $\tau$.

Aside: earlier semantics [18,7] was based on predicate transformers rather 
than predicates. In that semantics a nonterminating loop is equivalent to abort, 
as in Dijkstra’s calculus [1]. Hence that semantics is not suitable for this paper. 
2.1 Real-Time Specification Command

We define a possibly nonterminating real-time specification command with a
syntax similar to that of Morgan [15]:

$$\infty x : [P,\ Q] ,$$

where $x$ is a vector of variables called the frame, $P$ is the assumption made by
the specification, and $Q$ is its effect. The ‘$\infty$’ at the beginning is just part of the
syntax; it indicates that the command might not terminate.

$P$ is assumed to hold at the start time of the command. Although variables
are modelled as functions of time, we allow the assumption to refer to variables
without using an explicit time index, with the understanding that these stand
for the values of the variables at the start time, $\tau_0$. For a predicate, $P$, we use
the notation $P @ s$ to stand for $P$ with all free occurrences of $\tau$ replaced by $s$,
and all unindexed occurrences of a variable, $v$, replaced by $v(s)$.

The effect $Q$ is a relation between the start time $\tau_0$ and the finish time $\tau$, as
well as a constraint on the program variable traces. We use the notation $Q @ (s, t)$
to stand for the predicate $Q$ with all free occurrences of $\tau_0$ and $\tau$ replaced by $s$
and $t$, respectively, all unindexed occurrences of a variable, $v$, replaced by
$v(t)$, and all occurrences of initial variables, $v_0$, replaced by $v(s)$.

The frame, $x$, of a specification command lists those program variables that
may be modified by the command. All other program variables in scope, i.e.,
in $\hat{\rho}$ but not $x$, are defined to be stable for the duration of the command. The
predicate $stable(v, S)$ states that the variable $v$ has the same value over all the
times in the set $S$:

$$stable(v, S) \mathrel{\widehat{=}} S \ne \{\} \Rightarrow (\exists x \bullet v(\!|S|\!) = \{x\}) ,$$

where $v(\!|S|\!)$ is the image of the set $S$ through the function $v$. We allow the first
argument of stable to be a vector of variables, in which case all variables in the
vector are stable. The notation $[s \ldots t]$ stands for the closed interval of times
from $s$ to $t$, and $(s \ldots t)$ stands for the open interval. We also allow half-open and
half-closed intervals. The notation $\hat{\rho} \setminus x$ stands for the set of program variables
($\hat{\rho}$) minus the set of variables in $x$.
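
The stable predicate can be illustrated with a small executable sketch. This is only an approximation under stated assumptions: the paper's traces are total functions over continuous time, whereas the sketch checks a finite set of sample times, and the names `Trace`, `sample_interval`, `constant` and `step` are hypothetical helpers introduced here for illustration.

```python
from typing import Callable, Hashable, Iterable

# Assumption: a variable is modelled as a Python function from time to a value.
Trace = Callable[[float], Hashable]


def stable(v: Trace, times: Iterable[float]) -> bool:
    """True when v takes at most one distinct value over the given times."""
    image = {v(t) for t in times}  # the image of the sampled set through v
    return len(image) <= 1  # an empty set of times is trivially stable


def sample_interval(s: float, t: float, steps: int = 100) -> list[float]:
    """Finite stand-in for the closed interval from s to t (assumes s <= t)."""
    return [s + (t - s) * k / steps for k in range(steps + 1)]


def constant(_t: float) -> int:
    """A trace that never changes."""
    return 7


def step(t: float) -> int:
    """A trace that changes value at time 2."""
    return 0 if t < 2.0 else 1
```

For example, `stable(step, sample_interval(0.0, 1.0))` holds, while `stable(step, sample_interval(1.0, 3.0))` does not, because the sampled interval straddles the change at time 2.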

Definition 1 (real-time specification). Given variables, $\rho$, a frame, $x$,
contained in $\hat{\rho}$, a predicate, $P$, involving $\tau$ and the variables in $\rho$, and a predicate, $Q$,
involving $\tau_0$, $\tau$, variables in $\rho$, and initial variables (zero-subscripted variables)
corresponding to those in $\rho$, the meaning of a possibly nonterminating real-time
specification command is defined by the following.

$$\mathcal{M}_\rho(\infty x : [P,\ Q]) \mathrel{\widehat{=}} \tau_0 \le \tau \land
(\tau_0 < \infty \land P @ \tau_0 \Rightarrow Q @ (\tau_0, \tau) \land stable(\hat{\rho} \setminus x, [\tau_0 \ldots \tau]))$$

Note that if assumption $P$ does not hold initially the command still guarantees
that time does not go backwards. Because the time variable may take on the
value infinity, the above specification command allows nontermination. Note that
if $\tau = \infty$ the predicate $Q$ may not depend on the value of variables at time $\tau$,
because variables are only defined over $\mathit{Time}$, and not $\mathit{Time}_\infty$.

As abbreviations, if the assumption, $P$, is omitted, then it is taken to be
true, and if the frame is empty the ‘:’ is omitted.

All of the executable commands (Sect. 2.2) in our language only constrain 
the values of the program variables over the time interval over which they exe- 
cute. Typically the effect of a specification command only constrains the values 
of variables over the execution interval of the command, $[\tau_0 \ldots \tau]$. However,
we do not put any such restriction in the definition of a specification command
because, although the effect may constrain the value of variables before $\tau_0$ or
after $\tau$, the assumption of the specification may be strong enough to allow the
effect to be replaced by one that only constrains the variables over the execution
interval of the command. Such ‘replacement’ steps are part of the refinement
process. For example, if the effect constrains the value of the variables before $\tau_0$,
then in order for the specification to be implementable, the assumption should
have at least as strong a constraint on the variables before $\tau_0$, in which case
the effect can be replaced by one that does not constrain the variables before
$\tau_0$. It is also possible for the effect to constrain the value of the variables after
$\tau$. For example, for a central heater controller, the effect of a specification may
require the temperature to be above some lower limit mintemp for some time
interval after $\tau$. This is implementable provided the assumption of the
specification implies that the rate of change of the temperature over time is limited. The
specification can be implemented by ensuring the temperature is above mintemp 
by a large enough margin to ensure the temperature remains over mintemp over 
the required interval assuming the maximum rate of fall of the temperature. 



2.2 Real-Time Commands 

Other real-time commands can be defined in terms of equivalent specification
commands. Here we define: a terminating (no ‘$\infty$’ prefix) specification command,
$x : [P,\ Q]$; an assumption, $\{P\}$; the null command, skip, that does nothing
and takes no time; a command, idle, that does nothing but may take time; an
absolute delay command; a multiple assignment; a command, read, to sample a
value from an external input; the deadline command; the most nondeterministic
command, chaos; and the best of all (but unimplementable) command, magic.
External outputs may be modified using assignments. The expressions used in
the commands are assumed to be idle-stable, that is, their value does not change
over time provided all the program variables are stable.

Definition 2 (idle-stable). Given variables, $\rho$, an expression $E$ is idle-stable
provided

$$\tau_0 \le \tau \land stable(\hat{\rho}, [\tau_0 \ldots \tau]) \Rightarrow E @ \tau_0 = E @ \tau .$$

In practice this usually means that an idle-stable expression cannot refer to the
special time variable, $\tau$, or to the value of external inputs.
Definition 3. Given an idle-stable, time-valued expression $D$; a vector of
program variables, $x$; a vector of idle-stable expressions, $E$, of the same length as $x$
and assignment compatible with $x$; a program variable, $v$; and an input $i$ that is
assignment compatible with $v$, the real-time commands are defined as follows.

$$\begin{array}{rcl}
x : [P,\ Q] & \widehat{=} & \infty x : [P,\ Q \land \tau < \infty] \\
\{P\} & \widehat{=} & [P,\ \tau = \tau_0] \\
\textbf{skip} & \widehat{=} & [\tau_0 = \tau] \\
\textbf{idle} & \widehat{=} & [\tau_0 \le \tau] \\
\textbf{delay until}\ D & \widehat{=} & [D \le \tau] \\
x := E & \widehat{=} & x : [x @ \tau = E @ \tau_0] \\
v : \textbf{read}(i) & \widehat{=} & v : [v @ \tau \in i(\!|[\tau_0 \ldots \tau]|\!)] \\
\textbf{deadline}\ D & \widehat{=} & [\tau_0 = \tau \le D] \\
\textbf{chaos} & \widehat{=} & \{false\} \\
\textbf{magic} & \widehat{=} & [false]
\end{array}$$

The deadline command is novel. It takes no time and guarantees to complete
by the given deadline. It is not possible to implement a deadline command by
generating code. Instead we need to check that the code generated for a program
that contains a deadline command will always reach the deadline command by
its deadline [4]. We discuss this further with the example below.

The meaning of chaos is the predicate $\tau_0 \le \tau$. It only guarantees that time
does not go backwards and nothing more, not even stability of any program variable. The
meaning of magic is $\tau_0 \le \tau \land \tau_0 = \infty$. If it begins execution at a time other
than infinity this reduces to false and hence it refines any other command. Given
the definition of a specification command (and hence all the other commands), if
a specification begins at time infinity its meaning is $\tau_0 \le \tau$. Hence all commands
are refined by magic in this case also.

2.3 Detecting an Alarm Signal

We consider a very simple example of possibly nonterminating behaviour to
illustrate our approach. There is a Boolean alarm signal that indicates a fault
has occurred.

input alarm : Boolean

The input alarm is modelled as a function from Time to its type, Boolean. Hence
we can use expressions like $alarm(t)$, where $t$ is of type Time, to determine the
value of alarm at time $t$.

We assume that, if the alarm is raised, it stays raised for at least $2p$ time
units, where $p$ is a constant. As this is a global assumption that is independent
of the current time, we state it once and may assume it at any point
in the program development. The notation $alarm\ \textbf{over}\ [b \ldots e]$ stands for
$\forall t : [b \ldots e] \bullet alarm(t)$.

(1)
The problem is to detect an alarm signal and terminate. If the alarm never
occurs the program will not terminate. Because the concern of this paper is in
developing nonterminating loops, we give the specification of just the loop, and
assume appropriate initialisation has already taken place. The loop samples the
value of the alarm with a sample period of $p$. It makes use of two local variables:
the Boolean-valued variable $al$ is used to hold the sample of the alarm on each
iteration, and the time-valued variable $s$ maintains the start time of the sampling
period. (The type of $s$ is the time type provided by the target language; that type
is a subset of Time.) We assume $al$ is initialised to false, and $s$ is initialised to a
time close to the start time of the program. There are two possible outcomes of
executing the program: either the alarm is raised and the program terminates,
or the alarm is never raised and the program never terminates. The notation
$alarm\ \textbf{in}\ [s - p \ldots \tau]$ stands for $\exists t : [s - p \ldots \tau] \bullet alarm(t)$.

$$\infty s, al : \left[ \begin{array}{l}
s - p \le \tau \le s \land \lnot al , \\[4pt]
(\tau < \infty \land al \land alarm\ \textbf{in}\ [s - p \ldots \tau] \land (\lnot alarm)\ \textbf{over}\ [s_0 + p \ldots s - 2p]) \\
\lor \\
(\tau = \infty \land (\lnot alarm)\ \textbf{over}\ [s_0 + p \ldots \infty))
\end{array} \right] \qquad (2)$$



The alarm detection example can be refined to a loop that samples the alarm 
once every p seconds (see Fig. 1). We give the code for the implementation first in 
order to explain the commands available in the real-time programming language, 
and so that the reader is aware of the final form of the program. 

    $\{ s - p \le \tau \le s \land \lnot al \}$;
    do $\lnot al \rightarrow$
        delay until $s$;
        $al$ : read($alarm$);
        deadline $s + p$;
        $s := s + p$
    od                                                              (3)



On each iteration the loop body delays until the start of the sample period, s; 
samples the value of the alarm using a read command; must reach the deadline 
by the end of the period, s + p; and increments s by the length of the period, p,
to get the start of the next sample interval. If the value of the alarm when it 
was read was true, the loop terminates. The above loop guarantees that in any 
interval of length 2p the alarm is sampled at least once. The length is 2p rather 
than p because one sample may occur at the very start of its sample period and 
the next sample may occur at the very end of the next sample period, 2p time 
units later. 
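
The 2p bound can be checked with a small simulation. The sketch below is illustrative only: it assumes that each read happens at some arbitrary instant between the delay target s and the deadline s + p, and it ignores all other execution-time detail. The function names `sample_times` and `max_gap` are hypothetical and are not part of the calculus.

```python
import random


def sample_times(s0: float, p: float, iterations: int, seed: int = 0) -> list[float]:
    """Choose one read time per period, anywhere between s and s + p."""
    rng = random.Random(seed)
    times = []
    s = s0
    for _ in range(iterations):
        # 'delay until s' gives the lower bound, 'deadline s + p' the upper bound.
        times.append(rng.uniform(s, s + p))
        s += p  # corresponds to the assignment s := s + p
    return times


def max_gap(times: list[float]) -> float:
    """Largest distance between two consecutive sample times."""
    return max(later - earlier for earlier, later in zip(times, times[1:]))
```

With `p = 1.0`, the value of `max_gap(sample_times(0.0, 1.0, 1000))` never exceeds 2.0, and the extreme schedule `[0.0, 2.0]` (first sample at the very start of its period, second at the very end of the next) attains the bound exactly.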

In reasoning about the timing behaviour of the above real-time program we 
do not need to separately consider the execution times of the guard evaluation, 
delay, read, assignment and loop iteration. Instead we reason about the time 
taken for execution paths ending in deadlines. In this case there are two paths 
to consider. The first enters the loop and reaches the deadline for the first time. 
[Figure 1 legend: • sample point of read command; alarm signal true; alarm
could be true but undetected. The time axis is marked at $s_0$ and $s_0 + p$.]

Fig. 1. Sampling the alarm signal with period p



This path starts before time s and must finish by s + p. Hence it has an execution
time constraint of p. The second path starts at the deadline command, executes 
the assignment, loops back to the guard evaluation, delays and then reads the 
alarm; it has a time constraint of p as well. In calculating the constraint for 
the second path, although the deadline expression (s + p) at the start of the 
path is the same as the deadline expression (s + p) at the end of the path, in the
meantime s has been incremented by p, and that leads to a difference of p in 
the actual deadline times. 

Aside: If the loop is followed by a deadline command then there will be a 
third path starting at the deadline command within the loop, executing the 
assignment, branching back to evaluate the loop guard and exiting the loop; 
such a path is treated similarly to the other two. 

Expressing timing constraints on execution paths is both simpler and more 
general than having constraints on individual commands and components of 
commands. In addition, on architectures with pipelines and caches one can get 
better (lower) worst-case execution time bounds on whole paths, rather than 
individual commands, because the paths give additional context information 
that can be used to determine better worst-case execution time bounds [14]. 



2.4 Sequential Composition 



Because we allow nonterminating commands, we need to be careful with our 
definition of sequential composition. If the first command of the sequential com- 
position does not terminate, then the effect of the sequential composition must 
be the same as the effect of the first command. Otherwise it is the combined 
effect of the execution of the first followed by the second. Here we provide a 
definition of sequential composition in terms of the effects of the two commands. 
Definition 4 (sequential composition). Given variables, $\rho$, and real-time
commands, $C$ and $D$, their sequential composition is defined by the following.

$$\mathcal{M}_\rho(C; D) \mathrel{\widehat{=}} \exists \tau' : \mathit{Time}_\infty \bullet \mathcal{M}_\rho(C) @ (\tau_0, \tau') \land
(\tau' = \tau = \infty \lor \mathcal{M}_\rho(D) @ (\tau', \tau))$$
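
The shape of this definition can be mimicked in a few lines of code. The sketch below is a simplification under stated assumptions: a command is reduced to a predicate on its start and finish times only (variable traces are ignored), the existential over the intermediate time is approximated by a finite list of candidate times, and the names `seq`, `delay_until`, `deadline` and `run_forever` are hypothetical.

```python
import math
from typing import Callable, Iterable

# Assumption: a command is a predicate on (start time, finish time).
Command = Callable[[float, float], bool]


def seq(c: Command, d: Command, mids: Iterable[float]) -> Command:
    """Sequential composition, searching a finite set of intermediate times."""
    candidates = list(mids)

    def meaning(t0: float, t: float) -> bool:
        return any(
            # Either c never terminates (and so neither does the composition),
            # or d runs from the intermediate time m to the finish time t.
            c(t0, m) and ((m == math.inf and t == math.inf) or d(m, t))
            for m in candidates
        )

    return meaning


def delay_until(when: float) -> Command:
    """Terminates at some finite time no earlier than 'when'."""
    return lambda t0, t: t0 <= t and (t0 == math.inf or when <= t < math.inf)


def deadline(when: float) -> Command:
    """Takes no time and must be reached no later than 'when'."""
    return lambda t0, t: t0 <= t and (t0 == math.inf or (t0 == t and t <= when))


def run_forever(t0: float, t: float) -> bool:
    """A command that never terminates."""
    return t0 <= t and t == math.inf
```

Composing `run_forever` with any second command yields a predicate that only admits an infinite finish time, which is the behaviour the first disjunct of the definition is there to capture.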

3 Definition of a Real-Time Loop Command 

A real-time loop is similar to a conventional sequential programming loop, ex- 
cept that we take into account timing properties. To give the reader an idea 
of the differences between a real-time loop and a standard loop, we give the 
characteristic recurrences of both. A standard loop, 

$$SDO \mathrel{\widehat{=}} \textbf{do}\ B \rightarrow C\ \textbf{od} ,$$

satisfies the recurrence

$$SDO = \textbf{if}\ B \rightarrow C;\ SDO\ [\!]\ \lnot B \rightarrow \textbf{skip}\ \textbf{fi} .$$

In the standard calculus, this can be rewritten in the following form,

$$SDO = ([B];\ C;\ SDO) \sqcap [\lnot B] ,$$

in which guarded commands of the form $B \rightarrow S$ are rewritten in the equivalent
form of a coercion followed by the command ($[B];\ S$); the if-fi is replaced by a
demonic choice ($\sqcap$) because the guards are complementary; and $[\lnot B];\ \textbf{skip}$ is
replaced by its equivalent, $[\lnot B]$.

For the real-time loop command, 

$$DO \mathrel{\widehat{=}} \textbf{do}\ B \rightarrow C\ \textbf{od} ,$$

there must exist a strictly positive time $d$, such that the following recurrence
holds.

$$DO = |[\ \textbf{con}\ v : \mathit{Time} \bullet \{v = \tau\};\ [B];\ C;\ \textbf{delay until}\ v + d\ ]|;\ DO
\ \sqcap\ [\lnot B]$$

The logical constant $v$ captures the start time of a single iteration. The coercion
$[B]$ allows the first alternative to be executed if the guard evaluates to true. Note
that in the real-time case the guard evaluation may take time: the coercion, $[B]$,
may take time and must terminate. The delay until (absolute) time $v + d$ at the
end of an iteration ensures that each iteration of the loop takes a minimum
time, $d$. This rules out Zeno-like behaviour in which, for example, each iteration
takes half the time of the previous iteration. The value of $d$ can be arbitrarily
small (e.g., 1 attosecond), but it must be greater than zero. (A loop of the form
do true $\rightarrow$ ... od typically has the minimum overhead; its implementation may
take no time to evaluate the guard, but there will be a minimum time overhead
for the branch back to the start of the loop.)
The Boolean expression $B$ is assumed to be idle-stable. That is, its value
does not change with just the passage of time if the program variables are
stable. In practice this means $B$ cannot refer to the current time variable, $\tau$, or
to external inputs (which may change over time). We assume the guard evalu- 
ation terminates, but we place no explicit upper bound on the time taken for 
guard evaluation, because guard expressions may be arbitrarily complex. For a 
particular application there will be a time bound on guard evaluation, but this 
is catered for by using explicit deadline commands within the body of the loop. 
There is no need for a separate upper bound on the guard evaluation time in 
the definition of the loop. 

After completing the command, C, in the body of the loop, it iterates back 
to the guard evaluation. The delay until $v + d$ at the end of the iteration not
only ensures the minimum execution time for each iteration, it also allows for the 
time taken for the loop to branch back to the guard evaluation, because there is 
no explicit upper limit on the termination time of a delay command. 

The exit alternative of the loop, $[\lnot B]$, allows for the time taken to evaluate
the guard (to false) and exit the loop, including the case when the guard of the
loop is false initially. We place no explicit time bounds on this command in the
loop definition, but for a particular application the code following the loop may
include deadline commands, which explicitly introduce a time constraint. There
is no lower time bound on the exit alternative because the loop do false $\rightarrow$ ... od
can be implemented by skip, which takes no time.

In order to define the behaviour of a loop, we introduce an abbreviation to 
stand for the effect of one iteration of the loop. 

$$ITERATION \mathrel{\widehat{=}}
\mathcal{M}_\rho(|[\ \textbf{con}\ v : \mathit{Time} \bullet \{v = \tau\};\ [B];\ C;\ \textbf{delay until}\ v + d\ ]|)$$

If we expand out this definition using the meanings of the component commands
we get the following equivalent predicate, in which $\tau_1$ and $\tau_2$ are the start and
finish times, respectively, of the command $C$.

$$\begin{array}{l}
ITERATION = B @ \tau_0 \land \tau_0 + d \le \tau \land {} \\
\quad (\exists \tau_1, \tau_2 : \mathit{Time} \bullet \tau_0 \le \tau_1 \le \tau_2 \le \tau \land {} \\
\qquad stable(\hat{\rho}, [\tau_0 \ldots \tau_1] \cup [\tau_2 \ldots \tau]) \land \mathcal{M}_\rho(C) @ (\tau_1, \tau_2))
\end{array}$$

To define the behaviour of a loop we introduce a possibly infinite, increasing
sequence of times, $t_0 < t_1 < \ldots$, such that $t_j$ represents the start time of
iteration $j$ (starting the numbering from zero) and $t_{j+1}$ represents the finish time of
iteration $j$. For the iteration starting at $t_j$, $t_j$ corresponds to $\tau_0$ in the predicate
ITERATION, and $t_{j+1}$ corresponds to $\tau$. We develop the behaviour of the loop
via three cases.

Termination. If the loop terminates after $i$ iterations, then the termination
time of the whole loop is some time after $t_i$ (but less than infinity), the
loop guard is false on termination, and for all previous iterations, $j$, where
$0 \le j < i$, an iteration was executed at time $t_j$ and it terminated.

$$\begin{array}{l}
TERM \mathrel{\widehat{=}} \\
\quad \exists i : \mathbb{N} \bullet t_i \le \tau < \infty \land \lnot B @ \tau \land stable(\hat{\rho}, [t_i \ldots \tau]) \land {} \\
\qquad (\forall j : \mathbb{N} \bullet j < i \Rightarrow (t_{j+1} < \infty \land ITERATION @ (t_j, t_{j+1})))
\end{array} \qquad (4)$$

In the above, $t_i$ is the finish time of the last iteration, and the later time $\tau$ is
the finish time of the whole loop. During the interval from $t_i$ until $\tau$, all the
program variables are stable, and hence, because $B$ is idle-stable, $\lnot B$ at $t_i$
is equivalent to $\lnot B$ at $\tau$. Note the case when $i$ is zero, which corresponds
to the loop guard being initially false.

Infinite iteration. If the loop body terminates on every iteration of the loop, 
but the loop guard never becomes false, then the loop iterates forever. In this 
case the ‘termination’ time, $\tau$, is infinity and there is an infinite sequence of
iterations of the loop, such that the loop guard is true at the beginning of
each iteration, and the body is executed (and terminates) on each iteration.

$$FOREVER \mathrel{\widehat{=}}
\tau = \infty \land (\forall j : \mathbb{N} \bullet t_{j+1} < \infty \land ITERATION @ (t_j, t_{j+1})) \qquad (5)$$

Note that the definition of ITERATION includes the guard being true at 
the start of the iteration. 

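Non-termination of the loop body. The final case is if the loop does not
terminate because the body of the loop does not terminate. In this case
there is a finite sequence of starting times of iterations, $t_0, \ldots, t_i$. The
body is executed at all the times from $t_0$ up to $t_i$, but the last execution
of $C$ does not terminate. The last iteration begins execution at $t_i$, the guard
evaluates to true, and the loop body, $C$, begins execution at some later time
$\tau_1$ and never terminates.

$$\begin{array}{l}
BODY\_NONTERM \mathrel{\widehat{=}} \\
\quad \exists i : \mathbb{N} \bullet B @ t_i \land {} \\
\qquad (\exists \tau_1 : \mathit{Time} \bullet t_i \le \tau_1 \land stable(\hat{\rho}, [t_i \ldots \tau_1]) \land {} \\
\qquad\quad \mathcal{M}_\rho(C) @ (\tau_1, \tau) \land \tau = \infty) \land {} \\
\qquad (\forall j : \mathbb{N} \bullet j < i \Rightarrow (t_{j+1} < \infty \land ITERATION @ (t_j, t_{j+1})))
\end{array} \qquad (6)$$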

We can combine all the above into the following single definition. 

Definition 5 (repetition). Given an idle-stable Boolean expression, B, and a 
command, C , a real-time repetition command is defined by the following. 

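$$\begin{array}{l}
\mathcal{M}_\rho(\textbf{do}\ B \rightarrow C\ \textbf{od}) \mathrel{\widehat{=}} (\exists d : \mathit{Time};\ t : \mathbb{N} \rightarrow \mathit{Time} \bullet 0 < d \land t_0 = \tau_0 \land {} \\
\qquad (TERM \lor FOREVER \lor BODY\_NONTERM))
\end{array}$$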

Note that there is just one choice made for d, and that value is used for all 
iterations of the loop. That rules out, for example, successive iterations of a 
loop choosing progressively smaller values of d, and hence it rules out Zeno-like 
behaviour. A particular implementation of a loop will determine a suitable value 
of d. Our implementation-independent approach allows any value. 
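
The role of the single lower bound d can be seen numerically. The sketch below is illustrative only and uses hypothetical helper names: it contrasts a Zeno-like schedule, in which each iteration takes half the time of the previous one, with a schedule in which every iteration takes at least a fixed d.

```python
import math
from itertools import accumulate


def finish_times(durations: list[float]) -> list[float]:
    """Finish time of each iteration, starting the loop at time 0."""
    return list(accumulate(durations))


def zeno_durations(n: int) -> list[float]:
    """Each iteration takes half the time of the previous one."""
    return [1.0 / 2 ** k for k in range(n)]


def iterations_to_pass(horizon: float, d: float) -> int:
    """Iterations of length exactly d needed to finish strictly after horizon."""
    if d <= 0:
        raise ValueError("d must be strictly positive")
    return math.floor(horizon / d) + 1
```

The halving schedule never passes time 2 however many iterations are run, so infinitely many iterations would fit into a finite interval. With a fixed d greater than zero, any finite horizon is passed after finitely many iterations, so an infinite run must have an infinite finish time.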
3.1 Alarm Example

We illustrate the use of the loop definition by showing that the loop for the 
alarm detection implements the specification. In order to do this it is sufficient 
to know that the effect of a single iteration of the loop (3):

$$\begin{array}{l}
|[\ \textbf{con}\ v : \mathit{Time} \bullet \{v = \tau\};\ [\lnot al]; \\
\quad \textbf{delay until}\ s;\ al : \textbf{read}(alarm);\ \textbf{deadline}\ s + p;\ s := s + p; \\
\quad \textbf{delay until}\ v + d\ ]| ,
\end{array}$$

implies

$$\lnot al(\tau_0) \land al(\tau) \in alarm(\!|[s(\tau_0) \ldots \min\{\tau, s(\tau)\}]|\!) \land s(\tau) = s(\tau_0) + p .$$

For the loop body to be executed, the guard of the loop must be true at the start
time of the loop; hence $\lnot al(\tau_0)$. The initial delay guarantees that the time at
which alarm is read is after the initial value of $s$, $s(\tau_0)$, and the deadline ensures
that the value is read before $s(\tau_0) + p$. In addition the value is read before the
termination time of the iteration, $\tau$. Together these give that the value is read
before the minimum of $\tau$ and $s(\tau_0) + p$. The assignment, $s := s + p$, ensures
$s(\tau) = s(\tau_0) + p$. Note that the loop body introduces stronger constraints due to
the final delay command and that we could make more detailed assertions about
the values of the program’s variables during the execution of the body, but it is
sufficient that the above is implied by the loop body.

Using the abbreviation conventions described in Sect. 2.1 this predicate can
be written

$$\lnot al_0 \land al \in alarm(\!|[s_0 \ldots \min\{\tau, s\}]|\!) \land s = s_0 + p ; \qquad (7)$$

where the zero-subscripted names stand for the values of the variables at the
start of the execution of the body and the unsubscripted names stand for the
values of the variables at the end of the execution of the body. The upper bound
$\min\{\tau, s\}$ is required because in reasoning about the loop, if the loop terminates
after that iteration we need a bound of $\tau$, and if the loop does not terminate
after that iteration we need a bound of $s$.

To reason about the loop we consider the three cases examined earlier: ter- 
mination, infinite iteration, and nontermination of the loop body. For the alarm 
example the loop body always terminates, so we can ignore the last case. The 
infinite iteration case is the more novel so we consider that next. To simplify the 
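expressions we abbreviate expressions of the form $s(t_j)$ to $s_j$. From the infinite
iteration case (5) and the body (7), we can deduce the following.

$$\begin{array}{l}
\tau = \infty \land {} \\
(\forall j : \mathbb{N} \bullet t_{j+1} < \infty \land \lnot al_j \land {} \\
\quad al_{j+1} \in alarm(\!|[s_j \ldots \min\{t_{j+1}, s_{j+1}\}]|\!) \land s_{j+1} = s_j + p) \\
\Rightarrow \quad \text{as } al_j \text{ is false for all } j;\ [s_j \ldots \min\{t_{j+1}, s_{j+1}\}] \subseteq [s_j \ldots s_{j+1}] \\
\tau = \infty \land {} \\
(\forall j : \mathbb{N} \bullet t_{j+1} < \infty \land false \in alarm(\!|[s_j \ldots s_{j+1}]|\!) \land s_{j+1} = s_j + p) \\
\Rightarrow \quad \text{from the minimum alarm duration assumption (1)} \\
\tau = \infty \land (\lnot alarm)\ \textbf{over}\ [s_0 + p \ldots \infty)
\end{array}$$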

For the last step, recall that (1) states that whenever alarm is true, it is true
for at least $2p$ time units. For any value of $j$ greater than zero, one can deduce
that there exist times $f_{j-1}$, $f_j$ and $f_{j+1}$, such that alarm is false at each of these
times and

$$s_j - p \le f_{j-1} \le s_j \le f_j \le s_j + p \le f_{j+1} \le s_j + 2p .$$

Because $f_j - f_{j-1} \le 2p$ and alarm is false at both $f_{j-1}$ and $f_j$, alarm must be
false over the whole interval $[f_{j-1} \ldots f_j]$. Similarly alarm must be false over the
interval $[f_j \ldots f_{j+1}]$. Combining these two results, alarm must be false over the
interval $[f_{j-1} \ldots f_{j+1}]$. Because this interval includes $[s_j \ldots s_{j+1}]$, alarm must
be false over this interval. Finally, because this holds for all values of $j$ greater
than zero, one can deduce that alarm is false over the interval $[s_0 + p \ldots \infty)$.
The start of this interval is $s_0 + p$ because the first sample of alarm can be as
late as $s_1 = s_0 + p$.

For the termination case (4), we can deduce the following. 

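$$\begin{array}{l}
\exists i : \mathbb{N} \bullet t_i \le \tau < \infty \land al \land stable(al, [t_i \ldots \tau]) \land stable(s, [t_i \ldots \tau]) \land {} \\
\quad (\forall j : \mathbb{N} \bullet j < i \Rightarrow \\
\qquad (t_{j+1} < \infty \land \lnot al_j \land al_{j+1} \in alarm(\!|[s_j \ldots \min\{t_{j+1}, s_{j+1}\}]|\!) \land {} \\
\qquad\ s_{j+1} = s_j + p)) \\
\Rightarrow \quad \text{split out the last iteration } (j = i - 1);\ al_j \text{ is false for all previous iterations} \\
\exists i : \mathbb{N} \bullet t_i \le \tau < \infty \land al \land stable(al, [t_i \ldots \tau]) \land stable(s, [t_i \ldots \tau]) \land {} \\
\quad al_i \in alarm(\!|[s_{i-1} \ldots \min\{t_i, s_i\}]|\!) \land s_i = s_{i-1} + p \land {} \\
\quad (\forall j : \mathbb{N} \bullet j < i - 1 \Rightarrow \\
\qquad (t_{j+1} < \infty \land false \in alarm(\!|[s_j \ldots s_{j+1}]|\!) \land s_{j+1} = s_j + p)) \\
\Rightarrow \quad \text{from the stability of } al \text{ and } s,\ al = al_i \text{ and } s = s_i;\ t_i \le \tau \\
\exists i : \mathbb{N} \bullet t_i \le \tau < \infty \land al \land alarm\ \textbf{in}\ [s - p \ldots \tau] \land s = s_{i-1} + p \land {} \\
\quad (\forall j : \mathbb{N} \bullet j < i - 1 \Rightarrow \\
\qquad (t_{j+1} < \infty \land false \in alarm(\!|[s_j \ldots s_{j+1}]|\!) \land s_{j+1} = s_j + p)) \\
\Rightarrow \quad \text{from the minimum alarm duration assumption (1)} \\
\tau < \infty \land al \land alarm\ \textbf{in}\ [s - p \ldots \tau] \land (\lnot alarm)\ \textbf{over}\ [s_0 + p \ldots s - 2p]
\end{array}$$

The reasoning for the last step is similar to the reasoning for the infinite iteration
case, except that in this case the argument that alarm is false over the interval
$[s_j \ldots s_{j+1}]$ only applies for $j$ such that $1 \le j \le i - 3$, and hence alarm is false
over the interval $[s_1 \ldots s_{i-2}] = [s_0 + p \ldots s_{i-1} - p] = [s_0 + p \ldots s - 2p]$.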

Together the termination and infinite iteration cases give the desired overall 
effect of the original specification (2). 
4 Loop Introduction Laws

The above reasoning was done using the semantics of the loop directly, and 
hence was somewhat complicated. We would like a simpler approach closer to 
the conventional loop invariant approach [3,11]. In that approach, an invariant is 
assumed to hold initially and must be maintained by every iteration of the loop. 
When the loop terminates, as well as the loop guard being false, the invariant 
holds in the final state. However, this approach runs into the problem that, for 
a nonterminating loop, there is no such final state. 

To use a loop invariant to reason about a nonterminating loop, the strategy 
we use is similar to that of Hooman [12, page 129]. There is a sequence of points 
of time corresponding to the starts of execution of the loop body at which the 
invariant is true. If the body of the loop always terminates, then the sequence 
is infinite. In that case, for any time, $t$, after the start time of the loop, there is
always some later time, $t'$, at which both the invariant, $I$, and the loop guard, $B$,
hold.

$$\tau = \infty \land (\forall t : \mathit{Time} \bullet (\exists t' : \mathit{Time} \bullet t < t' \land (B \land I) @ t')) \qquad (8)$$

If we assume that I holds immediately before the loop starts, we would like to 
assume that both B and I hold at the start of the execution of the command, C, 
within the loop body. However, there is a period of time corresponding to the 
guard evaluation between the two points in the program. Because B is assumed 
to be idle-stable, it will still hold at the start of the execution of C . For the 
invariant, I, we need the condition that, if I holds before evaluation of the guard,
it will still hold after the evaluation. This is equivalent to I being invariant over 
the execution of an idle command, and we refer to this property as I being 
idle-invariant. 

Definition 6 (idle-invariant). For an environment with variables ρ, a predicate
P is idle-invariant provided

$$\tau_0 \le \tau \land stable(\rho, [\tau_0 \ldots \tau]) \land P \mathbin{@} \tau_0 \Rrightarrow P \mathbin{@} \tau .$$

The conditions idle-stable and idle-invariant differ in that for the former the 
value does not change over the execution of an idle command, whereas for the 
latter, if the value holds before, then it holds after. The latter differs from the 
former in that for idle-invariant, if the predicate is false beforehand, then it may 
become true during the execution of the idle. 

If the command, C, in the loop body maintains the invariant, then on termination
of C, I holds, and because I is idle-invariant, it will still hold after the
delay at the end of the loop body, and hence at the start of the next iteration, 
as required. 

The assumption that I is idle-invariant places restrictions on how I can
refer to the current time variable, τ, because τ increases on execution of an idle
command, and on how I refers to external inputs, because these may change
over the execution of an idle command. For example, predicates of the form
D < τ, where D is an idle-stable expression, are idle-invariant, but predicates of
the form τ < D are not.
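
The two example forms can be checked mechanically in a toy model. The sketch below is illustrative only: it assumes discrete integer time, treats D as a constant, models an idle command as doing nothing except letting the current time advance, and ignores external inputs. The names `is_idle_invariant`, `after_d` and `before_d` are hypothetical and do not come from the paper.

```python
from typing import Callable

Predicate = Callable[[int], bool]  # a predicate on the current time tau


def is_idle_invariant(pred: Predicate, horizon: int) -> bool:
    """Check, for all tau0 <= tau < horizon, that pred(tau0) implies pred(tau)."""
    return all(
        pred(tau)
        for tau0 in range(horizon)
        if pred(tau0)
        for tau in range(tau0, horizon)
    )


D = 5  # assumed constant, hence trivially idle-stable


def after_d(tau: int) -> bool:
    return D < tau  # once true, it stays true as tau increases


def before_d(tau: int) -> bool:
    return tau < D  # true early on, false once tau reaches D
```

Under these assumptions `is_idle_invariant(after_d, 20)` holds, whereas `is_idle_invariant(before_d, 20)` fails because the predicate is true at time 0 but false at time 5.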

If I can be expressed in a form that does not refer to the current time, τ,
and all references to external inputs are explicitly indexed with expressions that
are idle-stable, then I is idle-invariant. In practice, the link between the current
time, τ, and the invariant, I, is made through a time-valued program variable
that approximates τ, e.g., the variable s in the alarm detection program (3). We
return to this link later, but first we consider the alarm example.

4.1 Alarm Example 

The invariant of the alarm detection loop is that al represents the sample of the 
alarm in the most recent iteration (or it is the first iteration and al is false), and 
the alarm sample has been false on all previous iterations, and hence the alarm
has been false over (at least) the interval [u + p ... s − 2p], where u represents
the initial value of s just before the loop begins execution. We need to introduce
the extra logical constant u, because we use s_0 to stand for the initial value of s
on each iteration; u and s_0 only correspond on the first iteration. A little care
is required to ensure that the invariant holds initially. That corresponds to the 
situation in which s = u and al is initially false. 

$$I \mathrel{\widehat{=}} (al \in alarm(\!|\,[s - p \ldots \min\{\tau, s\}]\,|\!) \lor (\lnot al \land s = u)) \;\land \qquad (9)$$
$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s - 2p]$$

Note that I is idle-invariant because, if $al \in alarm(\!|\,S\,|\!)$ for a time interval S, then
$al \in alarm(\!|\,T\,|\!)$ for any time interval T that includes S.

To show that I is maintained by the loop in the alarm example it is sufficient
to show $I_0 \land (7) \Rrightarrow I$, where $I_0$ is I with all occurrences of τ and variables in
the frame replaced by their zero-subscripted forms.

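$$I_0 \land (7)$$

$$\equiv\; (al_0 \in alarm(\!|\,[s_0 - p \ldots \min\{\tau_0, s_0\}]\,|\!) \lor (\lnot al_0 \land s_0 = u)) \;\land$$
$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s_0 - 2p] \;\land$$
$$\lnot al_0 \land al \in alarm(\!|\,[s_0 \ldots \min\{\tau, s\}]\,|\!) \land s = s_0 + p$$

$\Rrightarrow$ as $\lnot al_0$; $[s_0 - p \ldots \min\{\tau_0, s_0\}] \subseteq [s_0 - p \ldots s_0]$; $s = s_0 + p$

$$(false \in alarm(\!|\,[s_0 - p \ldots s_0]\,|\!) \lor s_0 = u) \;\land$$
$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s_0 - 2p] \;\land$$
$$al \in alarm(\!|\,[s - p \ldots \min\{\tau, s\}]\,|\!) \land s = s_0 + p$$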

If $s_0 = u$ then $[u + p \ldots s - 2p] = [u + p \ldots s_0 - p]$ is empty, and hence
(¬ alarm) over $[u + p \ldots s - 2p]$ holds. Otherwise, because

$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s_0 - 2p] \land (\lnot alarm) \;\mathbf{in}\; [s_0 - p \ldots s_0] ,$$

from the minimum alarm duration property (1) one can deduce (¬ alarm) over
$[u + p \ldots s_0 - p] = [u + p \ldots s - 2p]$. Hence we can deduce the following.

$$al \in alarm(\!|\,[s - p \ldots \min\{\tau, s\}]\,|\!) \land (\lnot alarm) \;\mathbf{over}\; [u + p \ldots s - 2p]$$

$$\Rrightarrow\; I$$
On termination we know that both ¬ B and I hold. For the alarm example, B
is ¬ al. Hence, if the loop terminates, we can deduce the following.

$$\tau < \infty \land al \land (al \in alarm(\!|\,[s - p \ldots \min\{\tau, s\}]\,|\!) \lor (\lnot al \land s = u)) \;\land$$
$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s - 2p]$$

$$\Rrightarrow\; \tau < \infty \land al \land alarm \;\mathbf{in}\; [s - p \ldots \tau] \land (\lnot alarm) \;\mathbf{over}\; [u + p \ldots s - 2p]$$

If the loop never terminates, the loop body will be executed at an infinite 
number of progressively increasing times. Hence for any time, t, there exists a 
later time, t' , at which the loop invariant and loop guard hold (8). For the alarm 
example this gives the following. 

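$$\tau = \infty \land (\forall t : Time \bullet (\exists t' : Time \bullet t < t' \;\land$$
$$(\lnot al \land (al \in alarm(\!|\,[s - p \ldots \min\{\tau, s\}]\,|\!) \lor (\lnot al \land s = u)) \;\land$$
$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s - 2p]) \mathbin{@} t'))$$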

The interesting clause for the nontermination case is the last one concerning
¬ alarm. However, because s is not explicitly related to t', one cannot deduce the
desired conclusion of the specification (2), i.e., (¬ alarm) over [u + p ... ∞). If
we examine the loop implementation (3), it is clear that at the start of execution 
of the loop body the time must be before s + p, because the deadline of s + p 
later in the body of the loop could not be met otherwise. Hence we may add 
a deadline of s + p at the beginning of the loop body, with no effect at all on 
its overall behaviour, because there is a later (more constraining) occurrence of 
exactly the same deadline. The loop becomes the following. 

    {s = u ∧ s − p ≤ τ ≤ s ∧ ¬ al};
    do ¬ al →
        deadline s + p;
        delay until s;
        al : read(alarm);
        deadline s + p;
        s := s + p
    od
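
To see the shape of this loop in executable form, here is a minimal simulation sketch. It is not the paper's semantics: it assumes discrete integer time, instantaneous reads and assignments, and a hypothetical `max_iters` bound so that the nonterminating case can be cut off. The function name `detect_alarm` and its parameters are illustrative, and each deadline is modelled as an assertion that the current time has not passed s + p.

```python
from typing import Callable, Tuple


def detect_alarm(
    alarm: Callable[[int], bool], u: int, p: int, max_iters: int
) -> Tuple[bool, int, int]:
    """Simulate the sampling loop; returns (al, s, tau) when it stops."""
    tau, s, al = u, u, False      # precondition: s = u, tau <= s, not al
    iterations = 0
    while not al and iterations < max_iters:
        assert tau <= s + p       # deadline s + p (added at the loop start)
        tau = max(tau, s)         # delay until s
        al = alarm(tau)           # al : read(alarm), assumed instantaneous
        assert tau <= s + p       # deadline s + p
        s = s + p                 # s := s + p
        iterations += 1
    return al, s, tau
```

For an alarm that is raised from time 7 onwards, with u = 0 and p = 2, the simulated loop samples at times 0, 2, 4, 6 and 8 and stops with al true and s = 10, so the alarm was observed at s − p = 8.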

Hence for the infinite iteration case, one may now deduce that for any time, t,
there exists a later time, t', at which the command, C, begins execution and
hence at which both B and I hold. In addition, because of the added deadline,
we have the additional information that $t' \le s(t') + p \equiv (\tau \le s + p) \mathbin{@} t'$.

$$\tau = \infty \land (\forall t : Time \bullet (\exists t' : Time \bullet$$
$$(\lnot al \land (al \in alarm(\!|\,[s - p \ldots \min\{\tau, s\}]\,|\!) \lor (\lnot al \land s = u)) \;\land$$
$$(\lnot alarm) \;\mathbf{over}\; [u + p \ldots s - 2p] \land t \le \tau \le s + p) \mathbin{@} t'))$$

Hence for any time, w, such that $u + p \le w$, if we choose t such that $w + 3p \le t$,
then there exists a t' such that the invariant holds at t' and $w + 3p \le t \le t'$,
but $t' \le s + p$ and hence $w + 3p \le s + p$, i.e., $w \le s - 2p$, and hence from the
invariant (¬ alarm) over $[u + p \ldots w]$. Because this holds for any w, we can
deduce the desired conclusion:

$$\tau = \infty \land (\lnot alarm) \;\mathbf{over}\; [u + p \ldots \infty) .$$



4.2 Refinement Laws 

The reasoning used above can be used to derive the following refinement law for 
introducing a loop with a terminating body. 

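Law 7 (loop with terminating body). Given an idle-stable, Boolean-valued
expression, B; an idle-invariant predicate, I, not involving $\tau_0$ or initial
(zero-subscripted) variables; and an idle-stable, time-valued expression, D; then

$$\infty x : [I,\ (\tau < \infty \land \lnot B \land I) \lor (\tau = \infty \land I_\infty)]$$

$$\sqsubseteq\ \mathbf{do}\ B \to \mathbf{deadline}\ D ;\ x : [B \land I \land \tau \le D,\ I]\ \mathbf{od} ,$$

where $I_\infty \mathrel{\widehat{=}} (\forall t : Time \bullet (\exists t' : Time \bullet (B \land I \land t \le \tau \le D) \mathbin{@} t'))$.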

Finally we would like a more general law that allows the development of a 
loop in which the loop body may not terminate. As before, if the loop body does 
terminate we require that it re-establishes the invariant, I, and hence if the loop 
terminates one can deduce $\tau < \infty \land \lnot B \land I$, but if the loop body does not
terminate then it must establish some other condition, R. In this case the whole
loop establishes $\tau = \infty \land R$. If the loop does not terminate but the loop body
always terminates, then as before we can deduce $\tau = \infty \land I_\infty$. This gives the
following law. 

Law 8 (loop with nonterminating body). Given an idle-stable, Boolean-valued
expression, B; an idle-invariant predicate, I, not involving $\tau_0$ or initial
(zero-subscripted) variables; an idle-stable, time-valued expression, D; and
a predicate R not involving $\tau_0$ or initial variables; then

$$\infty x : [I,\ (\tau < \infty \land \lnot B \land I) \lor (\tau = \infty \land (I_\infty \lor R))]$$

$$\sqsubseteq\ \mathbf{do}\ B \to \mathbf{deadline}\ D ;$$
$$\qquad \infty x : [B \land I \land \tau \le D,\ (\tau < \infty \land I) \lor (\tau = \infty \land R)]$$
$$\mathbf{od}$$

The predicates I and R may not refer to $\tau_0$ or initial variables because I and R
are used both in the specification, in which $\tau_0$ is the start time of the whole
loop, and in the body of the loop, in which $\tau_0$ is the start time of an iteration.
In order to refer to the start time of the whole loop within I or R it is necessary
to introduce a fresh logical constant to stand for the start time (e.g., u in the
alarm example).

Note that Law 7 (loop with terminating body) is a special case of Law 8
(loop with nonterminating body) with R the predicate false.
5 Conclusions

The primary advantage of the approach taken in this paper is that we develop 
code for a machine-independent real-time programming language, and hence do 
not need to consider the detailed execution times of language constructs as part 
of the development process. This is achieved through the simple mechanism of 
adding a deadline command to our programming language. The approach allows 
the real-time calculus to appear to be a straightforward extension of the standard 
refinement calculus [5]. Of course, the compilation process now has the added 
burden of checking that the deadlines are met [4] . 

This paper has successfully extended this approach to develop possibly non- 
terminating loops, as required for nonterminating control applications. For the 
semantics of the loop we considered three cases: a terminating loop, infinite iter- 
ation, and a nonterminating loop body. Although it is possible to reason about 
loops using the semantics directly (as in Sect. 3.1), as with the standard refine- 
ment calculus, it is advantageous to develop refinement laws that make use of 
loop invariants. The termination case is the same as the conventional refinement 
calculus, with the exception that it is simpler because one does not need to use 
a loop variant to prove that the loop terminates (because it may not). 

The infinite iteration case is the most interesting to deal with. We use the fact 
that, if the loop body always terminates and the loop iterates for ever, then the 
invariant is true at an infinite number of progressively increasing times. However, 
in order to reason about loops in a machine-independent manner, we require that 
the loop invariant be idle-invariant, so that it holds over the executions of the 
guard evaluation and branch back phases of loop execution. This restricts the 
form of the invariant and blurs the link between the invariant and the current 
time variable, τ. To re-establish the link between the invariant and the time at
which the invariant is true, a deadline command can be added to the start of 
the loop body. That leads to our final refinement laws for nonterminating loops. 
Abstract. In this paper a programming language, qGCL, is presented
for the expression of quantum algorithms. It contains the features re- 
quired to program a ‘universal’ quantum computer (including initiali- 
sation and observation), has a formal semantics and body of laws, and 
provides a refinement calculus supporting the verification and deriva- 
tion of programs against their specifications. A representative selection 
of quantum algorithms are expressed in the language and one of them is 
derived from its specification. 



1 Introduction 

The purpose of this paper is to present a programming language, qGCL, for 
quantum computation. 

Quantum algorithms are usually described in pseudo code. For semantic sup- 
port there are two models of quantum computation: quantum networks [8,2] and 
quantum Turing machines [7]. The former provides a data-flow view and so is 
relevant when considering implementation in terms of gates; whilst it expresses 
modularisation well, it fails to express (demonic) nondeterminism or probability 
(both features of quantum computation). The latter is appropriate for com- 
plexity analysis but as inappropriate for modularised description and reasoning 
about correctness of quantum algorithms as standard Turing machines are for 
that purpose for standard algorithms. 

With qGCL we introduce an extension of the guarded-command language to 
express quantum algorithms. It contains both (demonic) nondeterminism and 
probability. The former arises in the specification of several quantum algorithms 
(and so in their derivations) and the latter is required in order to ‘observe’ a 
quantum system. qGCL has a rigorous semantics and body of laws as a result 
of other work (on probabilistic semantics; see for example [23]) and so benefits 
from an associated refinement calculus (exhibiting notions of program refine- 
ment, data refinement, containing high-level control structures and combining 
specification constructs with code). Moreover it abstracts implementation con- 
cerns like the representation of assignments as unitary transformations and the 
execution of those unitary transformations as gates. 

After the invention of various efficient quantum algorithms there seems to 
have been a period of consolidation in which frameworks have been sought to 
relate those algorithms. The ‘hidden subgroup problem’ [25] has been seen as a 
conceptually unifying principle whilst ‘multi-particle interference’ [6] has been
proposed as a unifying principle closer to implementation. More pragmatically, 
several simulations have been proposed [1,26,31] at various levels of applicability. 

Not surprisingly we take the formal-methods or ‘MPC’ view that a derivation 
is worth a thousand simulations (or more!). Thus in an area containing subtle 
algorithms formal reasoning can be expected to come into its own. One approach 
would be to perform derivations of quantum programs in a standard model and
‘bolt on’ reasoning to cover their probabilistic and quantum behaviour. A more 
elegant alternative would be a single formalism in which all aspects of a quantum 
program’s functionality are reasoned about at once. It might be thought that 
such a formalism would be unwieldy. From our experience with probabilistic
semantics we have found that not to be the case; it has led us to the present 
proposal. 

There have been at least two previous attempts to treat quantum computation
from a programming-language perspective: Greg Baker’s Q-GOL [1] and
Bernhard Ömer’s Quantum Computation Language, QCL [26]. The former provides
a graphical tool for building and simulating quantum circuits using the
gate formalism for quantum computation. It does not offer a concise program- 
ming language and is not able to implement and simulate all known quantum 
algorithms. 

Ömer’s QCL is a high-level architecture-independent programming language
for quantum computers, with a syntax very like that of C and an interpreter
powerful enough to implement and simulate all known quantum algorithms. It
incorporates neither probabilism nor nondeterminism, has no notion of program
refinement (and so no refinement calculus) and no semantics; furthermore only
standard observation is allowed. QCL is appropriate for numerical simulation of
quantum algorithms, whilst qGCL’s abstraction, rigorous semantics and associ- 
ated refinement calculus seem to make it more suitable for program derivation, 
correctness proof and teaching. 

Only experience will show whether qGCL is pitched at the right level of 
abstraction. However to support that view we here express in it a representa- 
tive selection of quantum algorithms and perform an exemplary, though simple, 
derivation. 

2 Quantum Types 

In this section we study, for use in quantum computation, a transformation q 
that converts a classical type to its quantum analogue. With but one simple 
exception, in section 6.5, quantum algorithms require application of q only to 
registers and so here we restrict ourselves to that case. 

Let B denote the type {0, 1} treated either as booleans or bits, as convenience 
dictates. For natural number n let 0 . . n denote the interval of natural numbers 
at least 0 but less than n 

$$0 \,..\, n \mathrel{\widehat{=}} \{i \mid 0 \le i < n\} .$$
A (classical) register of size n is a vector of n booleans. The type of all registers
of size n is thus defined to be the set of boolean-valued functions on 0..n

$$\mathbb{B}^n \mathrel{\widehat{=}} 0 \,..\, n \to \mathbb{B} .$$

Naturally we are interested in n at least 1 and identify $\mathbb{B}^1$ with $\mathbb{B}$.

The state of a classical system can be expressed using registers. From quantum
theory we learn¹ (for example from Young’s double-slit experiment) that
the state of a quantum system is modelled using ‘phase’ information associated
with each standard state. We follow convention [13,27] and represent phase as a 
complex number of modulus at most 1. The probability of observing a state is 
then the modulus squared of its phase; and all probabilities sum to 1. 

That leads to the following definition. 

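The quantum analogue of $\mathbb{B}^n$ is

$$q(\mathbb{B}^n) \mathrel{\widehat{=}} \{\chi : \mathbb{B}^n \to \mathbb{C} \mid \sum_{x : \mathbb{B}^n} |\chi(x)|^2 = 1\} . \qquad (1)$$

An element of $q(\mathbb{B})$ is called a qubit [28] and that of $q(\mathbb{B}^n)$ a qureg.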

Classical state is embedded in its quantum analogue by the Dirac delta function

$$\delta : \mathbb{B}^n \to q(\mathbb{B}^n)$$
$$\delta_x(y) \mathrel{\widehat{=}} (y = x) .$$

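The range of δ, $\{\delta_x \mid x \in \mathbb{B}^n\}$, forms a basis for quantum states in this sense:
any qureg $\chi : q(\mathbb{B}^n)$ is a square-convex complex superposition of standard
states

$$\chi = \sum_{x : \mathbb{B}^n} \chi(x)\,\delta_x \quad\text{with}\quad \sum_{x : \mathbb{B}^n} |\chi(x)|^2 = 1 .$$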

(In physics $\delta_x$ is denoted by the ket $|x\rangle$. Our choice of notation has been
determined by audience background.)
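
The definitions of this section can be made concrete with a small sketch. It is only an illustration under assumptions not made in the paper: a register is represented as a tuple of bits, a qureg as a dictionary from registers to complex amplitudes, and the names `registers`, `delta` and `is_qureg` are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Register = Tuple[int, ...]
Qureg = Dict[Register, complex]


def registers(n: int) -> List[Register]:
    """All classical registers of size n, as tuples of bits."""
    return list(product((0, 1), repeat=n))


def delta(x: Register) -> Qureg:
    """Embed the classical register x: amplitude 1 at x and 0 elsewhere."""
    return {y: (1 + 0j if y == x else 0j) for y in registers(len(x))}


def is_qureg(chi: Qureg, tol: float = 1e-9) -> bool:
    """Condition (1): the squared moduli of the amplitudes sum to 1."""
    return abs(sum(abs(a) ** 2 for a in chi.values()) - 1.0) < tol
```

For instance, every `delta(x)` passes `is_qureg`, as does the uniform assignment of amplitude 1/2 to each of the four registers of size 2.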

The Hilbert space $\mathbb{B}^n \to \mathbb{C}$ (with the structure making it isomorphic to $\mathbb{C}^{2^n}$)
is called the enveloping space of $q(\mathbb{B}^n)$; it is the Hilbert space of lowest dimension
containing $q(\mathbb{B}^n)$ as unit sphere. We shall see that, because the elements of the
range of δ are pairwise orthogonal in the enveloping space, they are observably
distinct with probability 1.



¹ In the talk which this paper accompanies the relevant features of quantum theory
will be introduced in a tutorial manner.

3 Tensor Products

In a standard programming language the state of a program having independent
component program variables can be expressed, more for theoretical than
practical convenience, as a single variable equal to the Cartesian product of the
components. The quantum analogue is that quantum state is the tensor product
of its independent state components (equation (2)). In describing algorithms we
thus have a choice between using individual variables, combining them when
required (for example by finalisation) using tensor product; and using a vector
of variables but subjecting it to transformation by the tensor product of
a particular function on a particular component with the identity function on
the remaining components. To support both approaches we require the tensor
product both of registers and of functions.

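The tensor product of (standard) registers is defined

$$\otimes : \mathbb{B}^m \times \mathbb{B}^n \to \mathbb{B}^{m+n}$$
$$(x \otimes y)(i) \mathrel{\widehat{=}} x(i \;\mathrm{div}\; n) \times y(i \bmod n)$$

and readily shown to be surjective. That definition lifts, via δ and linearity, to
quantum registers

$$\otimes : q(\mathbb{B}^m) \times q(\mathbb{B}^n) \to q(\mathbb{B}^{m+n}) .$$

Well-definedness (i.e. square-summability to 1) is immediate.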
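
As an illustration of the lifted product, the sketch below applies the same div/mod indexing to amplitude vectors. The representation is an assumption made here for concreteness: a qureg is a list of amplitudes indexed by standard state, so the operation is the usual Kronecker product of vectors. The name `tensor` is hypothetical.

```python
from typing import List


def tensor(chi: List[complex], xi: List[complex]) -> List[complex]:
    """Kronecker product: entry i is chi[i div n] * xi[i mod n], n = len(xi)."""
    n = len(xi)
    return [chi[i // n] * xi[i % n] for i in range(len(chi) * n)]
```

With this representation the product of two standard states is again a standard state, and the squared moduli of the result still sum to 1 whenever those of both arguments do.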

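For sets E and F of quregs we write

$$E \otimes F \mathrel{\widehat{=}} \{\chi \otimes \xi \mid \chi \in E \land \xi \in F\} .$$

Then the property of q alluded to above is the isomorphism

$$q(\mathbb{B}^m \times \mathbb{B}^n) \cong q(\mathbb{B}^m) \otimes q(\mathbb{B}^n) . \qquad (2)$$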

(Since both sides are finite-dimensional vector spaces the proof is a matter of
counting dimension. The left-hand side evidently has basis

$$\{(\delta_x, \delta_y) \mid x \in \mathbb{B}^m \land y \in \mathbb{B}^n\}$$

whilst a basis for the right-hand side consists of the equinumerous set
$\{\delta_x \otimes \delta_y \mid x \in \mathbb{B}^m \land y \in \mathbb{B}^n\}$.)

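Next tensor product of functions on registers is defined

$$\otimes : (\mathbb{B}^m \to \mathbb{B}^m) \times (\mathbb{B}^n \to \mathbb{B}^n) \to (\mathbb{B}^{m+n} \to \mathbb{B}^{m+n})$$
$$(A \otimes B)(x \otimes y) \mathrel{\widehat{=}} A(x) \otimes B(y) .$$

Finally ⊗ is extended by linearity to functions on quantum registers, for
which we follow tradition and use the same symbol yet again

$$\otimes : q(\mathbb{B}^m \to \mathbb{B}^m) \times q(\mathbb{B}^n \to \mathbb{B}^n) \to q(\mathbb{B}^{m+n} \to \mathbb{B}^{m+n}) .$$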
4 Probabilistic Language pGCL

In the next section we introduce an imperative quantum-programming language. 
But first, in this section, we recall Dijkstra’s guarded-command language [11],
GCL, extended to include probabilism [21,23] and called pGCL. 

Syntax for the guarded-command language consists of all but the last of these
constructs

    var                     variable declaration
    skip                    no op
    abort                   abortion
    x := e                  assignment
    P ; Q                   sequencing
    if [] b_i → S_i fi      conditional
    do [] b_i → S_i od      iteration
    P ⊓ Q                   (demonic) nondeterminism
    P r⊕ Q                  probabilism.



Semantics can be given either in terms of predicate transformers [11] or binary
relations [17]. In the former case each program is thought of as transforming a 
post-condition to the weakest precondition from which termination, in a state 
satisfying that postcondition, is guaranteed. In the latter case each program 
is thought of as transforming initial state to final state, with a virtual state 
encoding non-termination. 

We require the language to be extended, as usual, to embrace procedure 
invocation; see for example [20]. 

pGCL denotes the guarded-command language extended to contain probabilism.
Program $P \mathbin{{}_r\oplus} Q$ equals P with probability r and Q with probability
1 − r. Its semantics has been given in two forms, following the semantic styles of
GCL. The transformer semantics [22] extends pre- and post-conditions to pre-
and post-expectations: real-valued random variables; the relational semantics [16]
relates each initial state to a set of final distributions. In either case refinement
$P \sqsubseteq Q$ means that Q is at least as deterministic as P. The two models are
related by a Galois connection embedding the relational in the transformer [22].
There is a family of sound laws [16,23], including those for data refinement, so
that the language pGCL is embedded in a refinement calculus. It is that feature
which we exploit.

In pGCL (demonic) nondeterminism is expressed semantically as the combination
of all possible probabilistic resolutions

$$P \sqcap Q = \sqcap \{P \mathbin{{}_r\oplus} Q \mid 0 \le r \le 1\} . \qquad (3)$$

Thus a (demonic) nondeterministic choice between two programs is refined by
any probabilistic choice between them

$$\forall r : [0, 1] \bullet P \sqcap Q \sqsubseteq P \mathbin{{}_r\oplus} Q \qquad \text{(introduce probabilism)} .$$
Probabilism does not itself yield nondeterminism: if P and Q are deterministic
(maximal with respect to the refinement order) then so is $P \mathbin{{}_r\oplus} Q$. Unfortunately
for most authors in the area of quantum computation nondeterminism
means probabilism. One of the important (and technically difficult) features of 
pGCL is its combination of (demonic) nondeterminism and probabilism; the re- 
sult seems to provide just the right expressive power for the treatment of quan- 
tum algorithms. Indeed of the examples to follow, those of Grover, Shor and 
Deutsch-Jozsa all feature both (demonic) nondeterminism and probabilism. 

If a set E of expressions contains more than one element then in the guarded-command
language the assignment x :∈ E means the nondeterministic choice
over all individual assignments of elements of E to x. In pGCL that choice is 
interpreted to occur with uniform probability. 

As we need them we introduce two pieces of derived syntax concerning prob- 
abilism: one a prefix combinator (display (6) to follow); the other weakening 
exact probability r in probabilistic choice to the interval [r, 1] (definition (9) to 
follow) . 

5 Quantum Language qGCL 

A quantum program is a pGCL program invoking quantum procedures (described
below); the resulting language is called qGCL. It is important for us that qGCL,
being expressed in terms of pGCL, inherits its refinement calculus. That enables
us to combine code and specifications (and, less of a problem, to benefit from the 
usual liberties in writing programs, like using as guard the predicate ‘A times’) 
since the result has a semantic denotation to which refinement applies. 

There are three types of quantum procedure: initialisation (or state prepa- 
ration) followed by evolution and finally finalisation (or observation or state 
reduction). We now explain each of those three terms. 

5.1 Initialisation 

Initialisation is a procedure which simply assigns to its qureg state the uniform
square-convex combination of all standard states

$$\chi : q(\mathbb{B}^n)$$
$$\mathrm{In}(\chi) \mathrel{\widehat{=}} \chi := 2^{-n/2} \sum_{x : \mathbb{B}^n} \delta_x .$$

There χ is a result parameter.

Initialisation so defined is feasible in the sense that it is achievable in practice [8]
by initialising the qureg to the classical state $\delta_0$ (where 0 denotes the
register identically false) and then subjecting that to evolution by the (unitary)
Hadamard transform, defined as a tensor power:

$$H_n : q(\mathbb{B}^n) \to q(\mathbb{B}^n) \qquad (4)$$
$$H_1(\chi)(x) \mathrel{\widehat{=}} 2^{-1/2}\,(\chi(0) + (-1)^x \chi(1))$$
$$H_{n+1} \mathrel{\widehat{=}} H_n \otimes H_1$$

where exponentiation of bits is standard: $(-1)^x \mathrel{\widehat{=}} -1 \lhd x \rhd 1$.
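
The claim that the Hadamard transform takes the all-false standard state to the uniform superposition can be checked numerically. The sketch below is an illustration only: it assumes a qureg is stored as a list of 2^n amplitudes and applies the one-qubit transform to every bit position in turn, which is one standard way to compute the n-fold tensor power. The function name `hadamard` is hypothetical.

```python
import math
from typing import List


def hadamard(chi: List[float]) -> List[float]:
    """Apply the one-qubit Hadamard transform to every qubit of chi."""
    out = list(chi)
    root2 = math.sqrt(2)
    half = 1
    while half < len(out):
        # Combine pairs of amplitudes that differ in exactly one bit position.
        for start in range(0, len(out), 2 * half):
            for j in range(start, start + half):
                a, b = out[j], out[j + half]
                out[j], out[j + half] = (a + b) / root2, (a - b) / root2
        half *= 2
    return out
```

Applied to the vector with 1 in position 0 and 0 elsewhere, each of the 2^n entries of the result is 2^(−n/2); applying `hadamard` a second time returns the original vector, reflecting the fact that the transform is self-inverse.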
5.2 Evolution

Evolution consists of iteration of unitary transformations on quantum state. (It is 
thought of, after initialisation, as achieving all superposed evolutions simultane- 
ously, which provides much of the reason for quantum computation’s efficiency.) 
Again, evolution is feasible: it may be implemented using universal quantum 
gates [3,9]. 

For example on B, after initialisation, evolution by the Hadamard transformation
$H_1$ results in $\chi = \delta_0$ (because $H_1$ is not only unitary but equal to its
own conjugate transpose and so self-inverse). Thus our definition of initialisation
does not exclude setting state to equal $\delta_0$ (or any other standard state for that
matter). That fact is used in procedure Q in Shor’s algorithm (and similarly in
Simon’s algorithm, not considered here). 

Later we use this important example of evolution: for function $f : \mathbb{B}^n \to \mathbb{B}$
between registers, transformation $T_f$ between the corresponding quregs is defined
pointwise to invert χ about 0 if f holds and otherwise to leave it unchanged

$$T_f : q(\mathbb{B}^n) \to q(\mathbb{B}^n)$$
$$(T_f \chi)(x) \mathrel{\widehat{=}} (-1)^{f(x)} \chi(x) = -\chi(x) \lhd f(x) \rhd \chi(x) .$$

Evidently $T_f$ is unitary.
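
A pointwise sign flip of this kind is simple to sketch. The code below is illustrative and rests on assumptions made here: a qureg is a list of amplitudes, standard states are identified with list indices, and f is a Boolean-valued function on those indices. The name `t_f` is hypothetical.

```python
from typing import Callable, List


def t_f(f: Callable[[int], bool], chi: List[complex]) -> List[complex]:
    """Negate the amplitude of every standard state x for which f(x) holds."""
    return [-a if f(x) else a for x, a in enumerate(chi)]
```

Because only signs change, the squared moduli are untouched, and applying `t_f` twice with the same f restores the original amplitudes.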

More complicated evolutions appear in section 6. 

5.3 Finalisation 

Finalisation is a little more difficult to define largely because of the notation 
required. We motivate it by considering first the simple qubit case (later called 
‘diagonal’). 

Simple observation of a qubit $\chi = \chi(0)\,\delta_0 + \chi(1)\,\delta_1$ reduces it, by the principles
of quantum theory, to the standard state $\delta_x$ with probability $|\chi(x)|^2$, for $x : \mathbb{B}$.
Thus it might be expressed, using probabilistic assignment, as a procedure with
result parameter χ

$$\chi : q(\mathbb{B})$$
$$(\chi := \delta_0) \mathbin{{}_{|\chi(0)|^2}\oplus} (\chi := \delta_1) .$$

We find it convenient, for more general forms of observation, to conform 
to standard practice and return not just the reduced state (the eigenvector of 
the matrix corresponding to the observation) but also the eigenvalue, in this 
case 0 or 1. At the same time we note that the probability $|\chi(0)|^2$ equals the
inner product of the vector χ in enveloping space with its projection on the
one-dimensional subspace $\mathbb{C}\,\delta_0$

$$\langle \chi, P_{\mathbb{C}\delta_0}(\chi) \rangle = \chi(0)\,\overline{\chi(0)} = |\chi(0)|^2$$

where angle brackets denote inner product, $P_E(\chi)$ denotes projection of χ onto
subspace E and overline denotes complex conjugate. The procedure above then
becomes 
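$$x : \mathbb{B},\ \chi : q(\mathbb{B})$$
$$(x, \chi := 0, \delta_0) \mathbin{{}_{\langle \chi, P_{\mathbb{C}\delta_0}(\chi) \rangle}\oplus} (x, \chi := 1, \delta_1) .$$

In that case enveloping space $\mathbb{B} \to \mathbb{C}$ is the direct sum of the orthogonal subspaces
$\mathbb{C}\,\delta_x$ for $x : \mathbb{B}$.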

We now extend that simple case from qubits to quregs and from the family
of subspaces $\mathbb{C}\,\delta_x$ to a family of arbitrary pairwise orthogonal subspaces which
span enveloping space. In order to do so it is convenient to use the following
notation for the probabilistic combination of a list of more than two programs.

If $[(P_j, r_j) \mid 0 \le j < m]$ denotes a finite indexed family of (program, number)
pairs with $\sum_{0 \le j < m} r_j = 1$ then the probabilistic choice in which $P_j$ is chosen
with probability $r_j$ is written in prefix form

$$\oplus [P_j \mathbin{@} r_j \mid 0 \le j < m] \qquad (6)$$

(whose advantage is to avoid the normalising factors required by nested infix
form).
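
Operationally, the prefix form can be read as a single weighted draw from the family. The sketch below illustrates only that reading, not the expectation-transformer semantics of pGCL: programs are modelled as zero-argument Python callables, and the name `prefix_choice` is hypothetical.

```python
import random
from typing import Callable, Sequence, Tuple

Program = Callable[[], object]


def prefix_choice(
    pairs: Sequence[Tuple[Program, float]], rng: random.Random
) -> Program:
    """Pick one program from (program, probability) pairs in a single draw."""
    programs = [program for program, _ in pairs]
    probs = [prob for _, prob in pairs]
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return rng.choices(programs, weights=probs, k=1)[0]
```

A single draw over m alternatives sidesteps the renormalisation needed when the same choice is built from nested binary choices.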

Let $V = [V_j \mid 0 \le j < m]$ be an indexed family of pairwise orthogonal
subspaces which together span enveloping space,

$$\mathrm{span}\,[V_j \mid 0 \le j < m] = \mathbb{B}^n \to \mathbb{C} ,$$

where span E denotes the (complex) vector space generated by any subset E
of enveloping space. Finalisation with respect to V is defined to consist of a
procedure which reduces state to lie in one of the subspaces in V, with probability
determined as it was in the simple case above:

$$i : 0 \,..\, m,\ \chi : q(\mathbb{B}^n)$$
$$\mathrm{Fin}[V](i, \chi) \mathrel{\widehat{=}} \oplus [(i, \chi :\in \{j\}, V_j) \mathbin{@} \langle \chi, P_{V_j}(\chi) \rangle \mid 0 \le j < m]$$

wherein i is a result parameter determining the subspace to which state is reduced
and χ is a value-result parameter giving that state. In most cases (and
with good physical reason if the observation is not ‘complete’, i.e. $V_i$ is more
than one-dimensional) χ is not used, in which case we simply suppress it. We
include χ in the definition of finalisation, however, because one of the quantum
algorithms requires it (the last example). That definition provides the law
‘introduce finalisation’.

The simple form of finalisation introduced in the qubit case is sufficiently
important to warrant its own notation. We write Δ for the indexed family of
subspaces $[\mathbb{C}\,\delta_x \mid x \in \mathbb{B}^n]$. Then finalisation with respect to Δ is called diagonal
finalisation and abbreviated

$$x : \mathbb{B}^n$$
$$\mathrm{Fin}[\Delta](x) .$$

Its definition reduces to

$$\oplus [x \mathbin{@} |\chi(x)|^2 \mid x \in \mathbb{B}^n]$$
and the suppressed value of χ is determined by that of x since $q(\mathbb{B}^n) \cap \mathbb{C}\,\delta_x$ is
a singleton. When an output number $i : 0 \,..\, 2^n$ is required, it is produced by
applying to $x : \mathbb{B}^n$ the function which yields a number, num(x), whose binary
representation equals its argument

$$\mathrm{num} : \mathbb{B}^n \to 0 \,..\, 2^n \qquad (7)$$
$$\mathrm{num}(x) \mathrel{\widehat{=}} \sum_{j : 0..n} x(j)\, 2^j .$$
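
Diagonal finalisation and the function num can be sketched together. This is an illustrative reading under assumptions made here: a register is a tuple of bits with x[j] the bit of weight 2^j, a qureg is a dictionary from registers to amplitudes, and observation is simulated by a weighted random draw. The names `num` and `diagonal_finalisation` mirror the text, but the code is not taken from the paper.

```python
import random
from typing import Dict, Tuple

Register = Tuple[int, ...]


def num(x: Register) -> int:
    """The number whose binary representation is x, with x[j] weighted by 2**j."""
    return sum(bit << j for j, bit in enumerate(x))


def diagonal_finalisation(
    chi: Dict[Register, complex], rng: random.Random
) -> Register:
    """Return register x with probability |chi(x)|^2."""
    xs = list(chi)
    weights = [abs(chi[x]) ** 2 for x in xs]
    return rng.choices(xs, weights=weights, k=1)[0]
```

For example, `num((1, 0, 1))` is 5, and finalising a standard state always returns the register it was built from.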

The definition of finalisation accords with general principles of quantum the- 
ory (e.g. [13,18,27]), which permit simultaneous finalisation (or observation) — 
i.e. in either order with the same result — since the subspaces in V are orthog- 
onal. Thus feasibility of that definition is assured by general principles and in 
particular by Jozsa’s characterisation [19] of quantum-observable functions. 

It is interesting to note that finalisation is no more restrictive than probabilistic
choice. Indeed a simple trigonometric argument shows that $P \mathbin{{}_r\oplus} Q$ can
be achieved by a quantum program which uses ⊕ only in the form defined by
finalisation.

For examples of finalisation we proceed to the next section. 



6 Example Programs 

In this section we demonstrate the expressive power of qGCL by casting in it a 
representative selection of quantum algorithms and their specifications. Although 
it is their efficiency which validates these algorithms, we are interested here in 
formalising functionality. With each algorithm we state the feature of qGCL it 
illustrates. 



6.1 Fair Coin 

The first example is chosen to illustrate initialisation and diagonal finalisation 
without any evolution, and is included as a consistency check. It shows how the 
formalism is able to capture genuine probabilistic behaviour (i.e. not merely that 
of a finite automaton satisfying some fairness condition) . 

The example finds serious application in formalisation of the ‘Mach-Zehnder 
interferometer’ and, in particular, so-called ‘interference-free measurement’ [12]. 
In that setting the following program models a beam-splitter (a half-silvered 
mirror which either transmits or reflects incident photons with equal probability) 
and the Hadamard transform (4) is used for evolution. 

The toss of a fair coin is modelled by specifying the result to be a uniformly- 
distributed boolean: 

    var i : 𝔹 •
      i :∈ 𝔹

A quantum implementation is 

    var χ : q(𝔹), i : 𝔹 •
      In(χ) ;
      Fin[Δ](i)

which may be checked to satisfy its specification since the probability with which 
$i = 0$ is of course 

$$|\chi(0)|^2 = (2^{-1/2})^2 = 1/2 .$$

A formal proof is immediate from the definitions of initialisation and finalisation. 
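The consistency check can also be carried out numerically. The following Python sketch assumes that initialisation produces the uniform superposition (every amplitude equal to 2^(-n/2)) and that diagonal finalisation returns index x with probability equal to the squared modulus of the amplitude at x; the function names are hypothetical and chosen only for this illustration.

```python
import random


def initialise(n):
    """Uniform superposition over n-bit strings (assumed meaning of In)."""
    size = 2 ** n
    return [complex(2 ** (-n / 2))] * size


def outcome_probabilities(chi):
    """Probability of each diagonal outcome: squared modulus of the amplitude."""
    return [abs(a) ** 2 for a in chi]


def finalise_diagonal(chi, rng=random):
    """Sample one outcome index according to the probabilities above."""
    probs = outcome_probabilities(chi)
    return rng.choices(range(len(chi)), weights=probs, k=1)[0]


if __name__ == "__main__":
    chi = initialise(1)
    print(outcome_probabilities(chi))
    print(finalise_diagonal(chi))
```

For a single qubit both outcomes receive probability one half, which is the uniform boolean required by the specification.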



6.2 Grover’s Point Search 

The previous program can be proved to meet its probabilistic specification. The 
next example provides a more typical quantum algorithm which achieves its 
naive specification only to within a margin of error. This example thus shows 
how pGCL (and hence qGCL) captures this important type of behaviour. 

The point search problem is: given an array $f$ of $2^n$ bits containing a single 
1, locate it. A program which is correct on every execution is specified without 
any recourse to probability: 

    var j : 0..2ⁿ •
      j := f⁻¹(1)                                        (8)

A standard algorithm is at best $O(2^n)$ in both the worst and average cases. 
However Grover’s quantum algorithm [14], although correct only to within a 
margin of error $\varepsilon$ (dependent on the number of loop iterations), is $O(2^{n/2})$ in 
both those cases. It is conveniently specified by introducing some derived syntax: 
$P \,{}_{\ge r}\!\oplus\, Q$ equals $P$ with probability at least $r$ and otherwise equals $Q$. It is defined 
(cf. equation (3)) 

$$P \,{}_{\ge r}\!\oplus\, Q \;=\; \sqcap \{\, P \,{}_s\!\oplus\, Q \mid r \le s \le 1 \,\} \qquad (9)$$

which by a semantic convexity argument [23] simplifies to $(P \,{}_r\!\oplus\, Q) \sqcap P$. 

The error-prone point-search problem is thus specified to behave, with probability 
at least $\varepsilon$, like the naive behaviour (8) and otherwise to terminate with 
an arbitrary value for $j$: 

    var j : 0..2ⁿ •

$$(j := f^{-1}(1)) \;{}_{\ge \varepsilon}\!\oplus\; (j \in 0\,..\,2^n) .$$

Grover’s implementation contains evolution expressed as a loop and uses the 
function $num$ (see (7)). 

    var χ : q(𝔹ⁿ), i : 𝔹ⁿ, j : 0..2ⁿ •
      In(χ) ;
      do N times →
        χ := T_f(χ) ;
        χ := M(χ)
      od ;
      Fin[Δ](i) ;
      j := num(i)

There transform $T_f$ is defined by (5) and transform $M$ inverts $\chi$ (pointwise) 
about its average: 

$$M : q(\mathbb{B}^n) \to q(\mathbb{B}^n)$$

$$(M\chi)(x) = 2\,\Big[\,2^{-n} \sum_{y:\mathbb{B}^n} \chi(y)\Big] - \chi(x) .$$

We are not here concerned with the choice of $N$, which determines the number of 
iterations of the loop. The function $\varepsilon = \varepsilon(N)$ is investigated in [4] and its place 
in a semantic (expectation-transformer) proof of correctness is explored in [5]. 
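The loop body can be tried out on a small instance with a classical simulation of the amplitudes. The Python sketch below makes two assumptions that are not stated in this section: that initialisation yields the uniform superposition, and that $T_f$ (equation (5), not reproduced here) negates the amplitude at each x with f(x) = 1. Inversion about the average follows the definition of $M$ given above. Indices 0..2^n stand in for bit strings, so num is implicit.

```python
def initialise(n):
    """Uniform superposition over 2**n indices (assumed meaning of In)."""
    return [2 ** (-n / 2)] * (2 ** n)


def t_f(chi, f):
    """Assumed form of T_f: flip the sign of the amplitude wherever f(x) = 1."""
    return [-a if f(x) else a for x, a in enumerate(chi)]


def invert_about_average(chi):
    """(M chi)(x) = 2 * average(chi) - chi(x)."""
    avg = sum(chi) / len(chi)
    return [2 * avg - a for a in chi]


def grover_probabilities(f, n, iterations):
    """Run the loop body `iterations` times; return the diagonal outcome probabilities."""
    chi = initialise(n)
    for _ in range(iterations):
        chi = invert_about_average(t_f(chi, f))
    return [abs(a) ** 2 for a in chi]


if __name__ == "__main__":
    marked = 11
    probs = grover_probabilities(lambda x: x == marked, 4, 3)
    print(max(range(len(probs)), key=probs.__getitem__))
```

With n = 4 and three iterations the marked index carries most, but not all, of the probability, which is the margin-of-error behaviour that the specification with ${}_{\ge\varepsilon}\!\oplus$ is designed to capture.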

6.3 Deutsch-Jozsa Classification 

So far the specifications have been (demonically) deterministic (though proba- 
bilistic) and we have used only diagonal finalisation. The next example meets 
a nondeterministic specification, exhibits non-diagonal finalisation and requires 
no margin for error. 

A truth function $f : \mathbb{B}^n \to \mathbb{B}$ is constant iff it takes only a single value. It is 
balanced iff it takes values 0 and 1 equally often: 

$$\# f^{-1}(0) = \# f^{-1}(1) .$$

For use in the next section we note: 

$$f \text{ is constant iff } \# f^{-1}(1) \in \{0, 2^n\}, \qquad (10)$$

$$f \text{ is balanced iff } \# f^{-1}(1) = 2^{n-1},$$

$$\text{and } \# f^{-1}(1) = \sum_{x:\mathbb{B}^n} f(x) .$$

Any constant truth function / is not balanced. So any truth function is either 
not balanced or not constant, usually both. The Deutsch-Jozsa classification 
problem is to decide, for a given truth function, which holds; if both hold then 
either answer is correct. 

Letting the result be encoded by variable $i : \mathbb{B}$, the problem is specified 

    var i : 𝔹 •
      if  i  → f not balanced
      []  ¬i → f not constant
      fi

A standard algorithm for the Deutsch-Jozsa classification problem is at least 
$O(2^n)$ in the worst case and on average evaluates $f$ thrice. Deutsch and Jozsa’s 
quantum algorithm [10] contains just one evolution step using the transformation 
$T_f$ defined by equation (5). It is expressed in our notation: 



    var χ : q(𝔹ⁿ), i : 𝔹 •
      In(χ) ;
      χ := T_f(χ) ;
      Fin[𝒱](i)

where finalisation is non-diagonal: 

$$\mathcal{V} = [V, V^\perp]$$

$$V = \mathbb{C} \sum_{x:\mathbb{B}^n} \delta_x$$

$$V^\perp = \text{the orthogonal complement of } V .$$



A derivation of that algorithm (and hence its correctness) is exhibited in the 
next section. 
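Before the derivation, the behaviour of the program can be checked numerically on small truth functions. The Python sketch below rests on assumptions not confirmed by this section: initialisation gives the uniform superposition, $T_f$ negates the amplitude at each x with f(x) = 1, and the probability that finalisation lands in $V$ is the squared inner product of the state with the normalised uniform vector (the usual Born rule). The function name is hypothetical.

```python
def probability_in_v(f, n):
    """Probability that non-diagonal finalisation lands in V, the span of the uniform vector.

    Assumptions of this sketch: In yields amplitude 2**(-n/2) everywhere,
    T_f flips the sign where f(x) = 1, and outcome probabilities follow the Born rule.
    """
    amp = 2 ** (-n / 2)
    chi = [-amp if f(x) else amp for x in range(2 ** n)]
    # Inner product of chi with the unit vector whose entries are all amp.
    k = sum(a * amp for a in chi)
    return k * k


if __name__ == "__main__":
    print(probability_in_v(lambda x: 0, 3))      # constant function
    print(probability_in_v(lambda x: x & 1, 3))  # balanced function
```

Under these assumptions a constant function is always finalised into $V$ and a balanced one never is, so a single evolution step suffices to avoid ever giving a wrong answer; for a function that is neither, both outcomes are possible and both are permitted by the specification.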



6.4 Shor’s Factorisation Algorithm 



Shor’s quantum algorithms [29] for factorisation and for discrete logarithm are 
at once the most mathematically sophisticated and relatively efficient practi- 
cal quantum algorithms known. We consider the former algorithm which, as has 
been widely advertised, makes factorisation feasible by achieving an average-case 
polynomial efficiency instead of the standard exponential. Although it demon- 
strates no new features of qGCL we include it as the most important quantum 
algorithm to date. 

The factorisation problem is: given a natural number $n > 1$ find a prime 
divisor $d$ of $n$. It is thus naturally nondeterministic (as was Deutsch-Jozsa 
classification): 

    var d : 1..(n+1) •
      d is a prime divisor of n.



For natural numbers $x$ and $y$, we write $x \sqcup y$ for their maximum and $gcd(x, y)$ 
for their greatest common divisor. Shor’s algorithm is 

    var t : 𝔹, a, d, p : 0..(n+1) •
      t := 0 ;
      do ¬t →
        a :∈ 2..n ;
        d := gcd(a, n) ;
        if d ≠ 1 → t := 1
        [] d = 1 → Q(a, n; p) ;
                   if p odd  → t := 0
                   [] p even →
                        d := gcd(a^(p/2) − 1, n) ⊔ gcd(a^(p/2) + 1, n) ;
                        t := (d ≠ 1)
                   fi
        fi
      od

The quantum content lies in procedure Q. It is our first example to use 
quantum state after finalisation, though it does so for only standard purposes. 

    var χ : q(𝔹ᵐ × 𝔹ᵐ), x : 𝔹²ᵐ, c : 𝔹ᵐ •
      In(χ) ;
      χ := (I_m ⊗ H_m)(χ) ;
      χ := E(χ) ;
      χ := (F_m ⊗ I_m)(χ) ;
      Fin[Δ](x, χ) ;
      c := P_m(χ) ;
      p := post processing of c

where: 

$m$ satisfies $n^2 \le 2^m < 2n^2$; 

$H_m$ denotes the Hadamard transform defined by equation (4); 

unitary transformation $E : q(\mathbb{B}^m \times \mathbb{B}^m) \to q(\mathbb{B}^m \times \mathbb{B}^m)$ is defined in terms of 
modular exponentiation 

$$E(\chi)(x, y) = \chi(x,\; y \oplus (a^x \bmod n)) ;$$

$F_m : q(\mathbb{B}^m) \to q(\mathbb{B}^m)$ is the quantum Fourier transform (see [15], section 
3.2.2); 

diagonal finalisation has been extended to return also state $\chi$; 

$P_m : q(\mathbb{B}^m \times \mathbb{B}^m) \to \mathbb{B}^m$ denotes a kind of projection 

Em{dx ® ^ 1 and 

the post processing of $c$ is standard, using continued fractions to find efficiently 
the unique $p$ for which 

I num{c)lT^ - rf/p | < . 

The first two lines of procedure Q are equivalent (see section 5.2) to 

(where $\delta_0$ denotes the qureg containing m+n zeroes); however our insistence 
that quantum programs begin with a standard initialisation obliges us to take 
the longer version. 

Simon’s quantum algorithm for his masking problem [30] is similar in struc- 
ture, from our current point of view, to Shor’s factorisation algorithm. 
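The standard (non-quantum) shell of the factorisation algorithm can be exercised with a classical stand-in for procedure Q. The Python sketch below is an illustration under stated assumptions, not the algorithm as given: `order` replaces Q by brute-force search for the least p with a^p ≡ 1 (mod n), which is exponentially slower than the quantum procedure; n is assumed composite; and the extra check that d differs from n is an addition made here so that the sketch returns only non-trivial divisors (which need not be prime).

```python
import math
import random


def order(a, n):
    """Classical stand-in for Q: least p > 0 with a**p % n == 1.

    Requires gcd(a, n) == 1. Brute force, used only for illustration.
    """
    p, y = 1, a % n
    while y != 1:
        y = (y * a) % n
        p += 1
    return p


def find_factor(n, rng=random):
    """Return a non-trivial divisor of a composite n, following the loop structure above."""
    while True:
        a = rng.randrange(2, n)            # a chosen from 2..n, upper bound excluded
        d = math.gcd(a, n)
        if d != 1:
            return d                       # lucky guess: a shares a factor with n
        p = order(a, n)
        if p % 2 == 1:
            continue                       # odd order: try another a
        half = pow(a, p // 2, n)
        d = max(math.gcd(half - 1, n), math.gcd(half + 1, n))
        if d not in (1, n):                # assumption of this sketch: also reject d == n
            return d


if __name__ == "__main__":
    print(find_factor(15, random.Random(1)))
```

The loop structure mirrors the program: a random a, a gcd test, the order p, and a retry when p is odd or the candidate divisor is trivial.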

6.5 Finite Automaton 

The previous algorithm uses quantum state after finalisation for the purpose of 
(standard) post processing. However since finalisation was diagonal, the quantum 
state could have been inferred from the eigenvalue returned by finalisation. The 
next example uses non-diagonal finalisation and makes genuine use of quantum 
state after finalisation. It thus justifies our inclusion of state in finalisation. 

Recall that (standard) finite automata, whether deterministic or not and 
one-way or two-way, accept just regular languages. For quantum finite automata 
enough is already known to demonstrate that the picture is quite different 
(see [15], chapter 4). 

A one-measurement one-way quantum finite automaton employs finalisation 
after reading its input string and so is readily expressed in our notation. So 
instead we consider the many-measurement version which employs finalisation 
after reading each symbol on its input string. It turns out (Kondacs and Watrous; 
see, for example, [15] p. 159) that a many-measurement one-way quantum finite 
automaton accepts a proper subset of regular expressions. Here we give sufficient 
of the definition to permit its translation into our programming language. 

For any set $\Sigma$, let $\Sigma^*$ denote the set of all finite sequences of elements of $\Sigma$. 
Suppose that set $S = \{S_a, S_r, S_n\}$ of subsets of $\Sigma^*$ partitions $\Sigma^*$. A sequence 
$s : \Sigma^*$ is said to be accepted by $S$ if $s \in S_a$, to be rejected by it if $s \in S_r$ and to 
fail classification if $s \in S_n$. Evaluation of that is specified: 

    var i : {a, r, n} •
      s ∈ S_i.

But here we are interested in computing whether a prefix of a given sequence 
is accepted or rejected, since that gives rise to an automaton which continues 
to use its quantum state after finalisation. Its specification thus extends the 
previous one. In it $t \le s$ means that sequence $t$ is a prefix of sequence $s$. 

    var i : {a, r, n} •
      ( i = a ⇒ ∃ t ≤ s • t ∈ S_a
        i = r ⇒ ∃ t ≤ s • t ∈ S_r
        i = n ⇒ s ∈ S_n )

(A stronger specification might reflect the fact that computation proceeds from 
left to right and so ensure that sequence $t$ there is smallest.) 

A many-measurement one-way quantum finite automaton is designed to 
achieve such a computation with efficient quantum evolution. It has a finite 
set Q of states, with distinguished acceptance states Qa and rejection states Qr 



$$Q_a \subseteq Q$$

$$Q_r \subseteq Q$$

$$Q_a \cap Q_r = \{\} .$$

Thus Qa U Qr need not exhaust Q. 

On input sequence s = [(Tq, • • ■ , : E* the automaton evolves succes- 

sively under unitary transformations 

subject to finalisation after each. If a finalisation leaves the automaton in an 
acceptance state then computation terminates with output value $i = a$; if it 
leaves the automaton in a rejection state then computation terminates with 
output value $i = r$; but otherwise the automaton reiterates. If it has not accepted 
or rejected a prefix of the input sequence, it terminates when the entire sequence 
has been read, with value $i = n$. 

We thus let quantum state belong to $q(Q)$, defined as was qureg state by 
(1). Initialisation over $q(Q)$ is defined as for registers; its feasibility is assured by 
solubility of the appropriate simultaneous equations describing a unitary transformation 
that yields a uniform image of $\delta_0$. For finalisation we take 

$$\mathcal{V} = [V_a, V_r, V_n]$$

$$V_a = \mathrm{span}\{\delta_x \mid x \in Q_a\}$$

$$V_r = \mathrm{span}\{\delta_x \mid x \in Q_r\}$$

$$V_n = (V_a \oplus V_r)^\perp .$$

A program for such an automaton is 

    var χ : q(Q), b : 𝔹 •
      In(χ) ;
      χ, b := U_init(χ), 0 ;
      do ¬(b ∨ s = []) →
        Fin[𝒱](i, χ) ;
        if i ∈ {a, r} → b := 1
        [] i = n      → χ, s := U_head(s)(χ), tail(s)
        fi
      od

That can be expressed only because we allow quantum state to be returned 
by finalisation. 



7 Example Derivation 

We conclude by outlining an algebraic derivation of the Deutsch-Jozsa classification 
algorithm. It is to be emphasised that derivations (or verifications) using the 
refinement calculus (i.e. the laws concerning refinement between programs; see 
for example [20]) are quite different in style from those phrased in terms of semantics 
(cf. [5]). We are interested primarily in the shape of the derivation; and we 
shall see that it is largely standard. This example demonstrates how the refinement 
calculus which qGCL inherits from pGCL permits ‘homogeneous’ reasoning 
about the functionality of quantum algorithms, without recourse to arguments 
outside the formalism (pertaining for example to probabilism or ‘quantism’). 

The following derivation can be followed intuitively as well as rigorously; the 
steps involved correspond largely to steps in an informal explanation of why 
the algorithm works. At one point $\sqsubseteq$ is extended to mean also data refinement, 
of which care must (in general) be taken in the probabilistic setting; but here 
the refinement is unproblematic. Interesting features of the derivation are the 
appearance of quantum state and of the three quantum procedures. 

    var i : 𝔹 •
      if  i  → f not balanced
      []  ¬i → f not constant
      fi

⊑   (10) and standard reasoning 

    var i : 𝔹, j : 0..2ⁿ •
      j := Σ_{x:𝔹ⁿ} f(x) ;
      if j ≠ 2ⁿ⁻¹    → i := 1
      [] j ∉ {0, 2ⁿ} → i := 0
      fi

□ 

var i : B, j : 0 . . 2" • 

if je{o,2n 

D J = 

D j^{0,2"-i,2"} 

fi 

□ arithmetic and standard and probabilistic reasoning 
var i :M, j : 0 . . 2" • 

j :=Ex:B"/(2;)9 

if je{0,2"-i,2"} ^ (i :=l)|i_^./2n-i|0(*:=O) 

D j^{ 0 , 2 "-i, 2 "} ^ (i :=l)n(i:=0) 

fi 

□ 



standard reasoning 



i := 1 
i := 0 

{i := 1) n (i := 0) 



injective data refinement $k = 1 - j/2^{n-1}$ 

    var i : 𝔹, k : [−1, 1] •

if fce {-1,0,1} — > (i := 1) |fc|0 (i := 0) 

D {-1,0,1} ^ (*:=l)n(*:=0) 

fi 



C ‘introduce probabilism’ 

var i :V>, k : [—1, 1] • 

(j := 1) |fc|© (* := 0) 

C ‘sequential composition’ 

var i :M, k : [—1, 1], x ^ g(B") • 

k := xi^)9 

{i := 1) |fc|© {i := 0) 

C ‘sequential composition’ and definition of Tj 



var f : B, fc : [—1, 1], x ^ q{M'^) • 

X := 2 9 

X := 7>(x)9 
k := X{x)9 

{i := 1) |fc|© {i := 0) 

C definitions of In, Fin and diminish by A: = (x,2“"'^^ { — 

    var i : 𝔹, χ : q(𝔹ⁿ) •
      In(χ) ;
      χ := T_f(χ) ;
      Fin[𝒱](i)

with family $\mathcal{V} = [V, V^\perp]$, where $V = \mathbb{C} \sum_{x:\mathbb{B}^n} \delta_x$. 

8 Conclusions 

We have proposed a language, qGCL, for the expression of quantum algorithms 
and their derivations. It exhibits several features, many as a result of the work 
on pGCL: 

1. expressivity: the language is sufficiently expressive to capture existing quantum 
algorithms 

2. simplicity: the language seems to be as simple as possible (by law (3)) whilst 
containing (demonic) nondeterminism and probability (the latter either in a 
form restricted to ‘observation’ or for general use) 

3. abstraction: the language contains control structures and data structures 
at the level of abstraction of today’s imperative languages whilst abstract- 
ing implementation concerns (like the representation of a function / on the 
underlying standard types by its Lecerf-Bennett form on their quantum ana- 
logues) 

4. calculation: the language has a formal semantics, sound laws and provides a 
refinement calculus supporting verification and derivation of quantum pro- 
grams 

5. the language provides a uniform treatment of ‘observation’. 

We conclude that it does seem possible to treat quantum programs in a 
refinement calculus with the same degree of elegance and rigour as standard 
algorithms. Starting from a specification in which standard and probabilistic (but 
not quantum) expressions appear it is possible to derive quantum algorithms by 
introducing algorithmic and quantum structure (in the guise of quantum state 
and the three quantum procedures). 

We have still to learn whether there are reusable data refinements, or other 
derivation clichés, appropriate to the derivation of quantum programs. But abstraction 
from implementation concerns seems to make quantum algorithms easier 
to express, understand and reason about. Unmentioned here are more general 
properties of the functor $q$ on types and work on the compilation of the programs 
expressed in qGCL ([32]). 



Abstract 



Regular expressions are a standard means for denoting formal languages that are 
recognizable by finite automata. Much less familiar is the use of syntactic expres- 
sions for (formal) power series. Power series generalize languages by assigning to 
words multiplicities in any semiring (such as the reals) rather than just Booleans, 
and include as a special case the set of streams (infinite sequences). Here we shall 
define an extended set of regular expressions with multiplicities in an arbitrary 
semiring. The semantics of such expressions will be defined coinductively, al- 
lowing for the use of a syntactic coinductive proof principle. To each expression 
will be assigned a nondeterministic automaton with multiplicities, which usually 
is a rather efficient representation of the power series denoted by the expres- 
sion. Much of the above will be illustrated for the special case of streams of real 
numbers; other examples include automata and languages (sets of words), and 
task-resource systems (using the max-plus semiring). The coinductive definitions 
mentioned above take the shape of what we have called behavioural differential 
equations, on the basis of which we develop, as a motivating example, a theory 
of streams in a calculus-like fashion. 

Our perspective is essentially coalgebraic. More precisely, the set of all formal 
power series, including the set of languages and the set of streams as special 
instances, is a final coalgebra. This fact is the basis for both the coinduction 
definition and the coinduction proof principle. 

For general background on coalgebra, see [1] and [2]. The proceedings of the 
recently established workshop series CMCS (Coalgebraic Methods in Computer 
Science), contained in Volumes 11, 19, and 33 of Elsevier’s Electronic Notes in 
Theoretical Computer Science, give a good impression of many of the latest de- 
velopments in coalgebraic studies. References related to the theory summarized 
above are [3], dealing with automata and languages, and [4], on formal power 
series. A technical report on behavioural differential equations is in preparation. 



Abstract. It is possible, but difficult, to reason in Hoare logic about programs 
which address and modify data structures defined by pointers. The challenge is 
to approach the simplicity of Hoare logic’s treatment of variable assignment, 
where substitution affects only relevant assertion formulae. The axiom of 
assignment to object components treats each component name as a pointer- 
indexed array. This permits a formal treatment of inductively defined data 
structures in the heap but tends to produce instances of modified component 
mappings in arguments to inductively defined assertions. The major weapons 
against these troublesome mappings are assertions which describe spatial 
separation of data structures. Three example proofs are sketched. 



1 Introduction 

The power of the Floyd/Hoare treatment of imperative programs [8][11] lies in its use 
of variable substitution to capture the semantics of assignment: simply, $R^x_E$, the result 
of replacing every free occurrence of variable $x$ in $R$ by formula $E$, is the precondition 
which guarantees that assignment $x := E$ will terminate in a state satisfying $R$.* At a 
stroke difficult semantic questions that have to do with stores and states are converted 
into simpler syntactic questions about first-order logical formulae. 

We encounter several difficulties when we attempt to use a similar approach to 
deal with programs which manipulate and modify recursive data structures defined by 
pointers. The first difficulty, whose solution has been known for some time, is 
aliasing: distinct and very different pointer formulae may refer to the same object. The 
second difficulty is the treatment of assertions which include inductive formulae 
describing heap data structures. The final difficulty is the complexity of the proofs: 
not only do we have to reason formally about sets, sequences, graphs and trees, we 
have to make sure that the locality of assignment operations is reflected in the 
treatment of assertions about the heap. 

For all of these reasons, Hoare logic isn’t widely used to verify pointer programs. 
Yet most low-level and all object-oriented programs use heap pointers freely. If we 
wish to prove properties of the kind of programs that actually get written and used, we 
shall have to deal with pointer programs on a regular basis. 



* I neglect definedness conditions throughout this paper. 



R. Backhouse and J. N. Oliveira (Eds.): MPC 2000, LNCS 1837, pp. 102-126, 2000. 
(c) Springer-Verlag Berlin Heidelberg 2000 

In one situation program verification is a practical necessity. The idea of ‘proof-carrying 
code’ (see, for example, Necula and Lee [22] and Appel and Felty [1]) is that 
programs should be distributed along with a proof of their properties, the proof to be 
checked by each user before the program is used. Machine checkers are simple, 
reliable and fast, so if appropriate proofs can be provided we won’t have to spend 
ages reading the fine print before running our latest downloaded system extension. 

Proof-carrying code is a long way off, but being able to deal effectively with 
pointer programs will take us a step along the way. 

This paper is therefore about verifying the properties of programs. It is not about 
proof development, but it is not intended as a challenge to those who prefer program 
refinement to program verification. Once we can make reliable proofs about 
imperative pointer algorithms, surely the mechanisms developed to support proof can 
be used in aid of other activities. 



1.1 Background 

This paper was inspired by the work of Morris, who put forward in 1981 [21] axioms 
for assignment to object components, described a mechanism for dealing with 
inductively defined data structures and presented a semi-formal proof of the Schorr- 
Waite graph marking algorithm. Earlier still, Burstall presented in 1972 [6] a 
treatment of list-processing algorithms and pointed out the analogy between the 
treatment of array element and object components; his Distinct Non Repeating List 
Systems made possible succinct and convincing proofs of list and tree algorithms. 
Recently Reynolds [24] revisited Burstall’s work, refining the treatment of spatial 
separation between objects and data structures in the heap and extending its range of 
application, but working with forward rather than backward reasoning. 

Attempts have been made to apply Hoare logic to pointer programs by 
incorporating a model of the store, or part of the store, into the assertion logic. 
Luckham and Suzuki [19], Leino [18] and Bijlsma [3], for example, identify a 
subsection of the heap with each pointer type. Kowaltowski [16] includes the entire 
heap. 

Other work recognises the analogy between array element and object component 
assignment but doesn’t do so effectively: both Hoare and Wirth [13] and Gries and 
Levin [10], for example, give an axiom for object-component assignment which deals 
only with the simplest non-pointer cases and neglects entirely to deal with pointer 
aliasing. 

Cousot [7] gives a brief survey of other work, most of which is semantic in 
character and does not address the practical concerns of program verification. 



1.2 Notation 

I write $\to$, $\wedge$, $\vee$, $\neg$ for logical implication, conjunction, disjunction and negation; $\triangleq$ 
means equal by definition; @ is append (sequence concatenation); $\not\cap$ is disjointness 
of sequences; $\in$ is sequence and/or set membership. $E^x_F$ is the result of substituting 
formula $F$ for every free occurrence of variable $x$ in formula $E$. $A \oplus B \mapsto E$ is a 
mapping which is everywhere the same as $A$, except at $B$ which it maps to $E$. I use 
to define $f$-linked sequence data structures from section 6 onwards. 



2 The Problem of Aliasing 

The Hoare logic treatment of assignment is sound when we are sure that distinct 
variable names refer to distinct storage locations; it can mislead if that assurance is 
lost. Aliasing, which occurs when distinct formulae describe the same storage location, 
comes in at least the following range of flavours. 

• Parameter aliasing is the best known and probably the most difficult to deal with. 
It arises during the execution of procedures and functions, when a call-by-reference 
parameter (a var parameter in Pascal [14], for example) stands for a storage 
location outside the procedure which is also nameable in another way from within 
the procedure. 

• Subscript aliasing arises in languages which use arrays: $b[I]$ and $b[J]$ are the 
same storage location just when $I = J$ as integers. 

• Pointer aliasing, analysed below, arises when an object (a node, a record) can be 
referred to indirectly via a pointer value.* 

• Overlap aliasing occurs when storage locations can contain storage locations, as 
when an object is updatable by assignment, simultaneously updating all of its 
components, and those components are each separately updatable storage 
locations. 

• View aliasing occurs when the same area of store can be addressed in different 
ways, as with Pascal variant records, C [15] union types, or C casting. 

Aliasing is caused by identity or overlap of lvalues (addresses of objects in the 
store, see Strachey [26]), but both subscript and pointer aliasing can be dealt with by 
comparing rvalues (contents of storage locations). That makes them in principle 
tractable in the Floyd/Hoare tradition, which deals entirely with rvalues. Parameter, 
overlap and view aliasing are outside the scope of this paper. 



2.1 Subscript Aliasing 

Distinct formulae indexing the same array will be equivalent as lvalues if the rvalues 
of their subscript expressions are equal. The Pascal program fragment 

b[i] := b[j]+1; 

if b[i]=b[j] then writeln(output, 'aliased!') 
else writeln(output, 'distinct') 



* In languages such as C [15] which allow pointers to stack components, pointer aliasing is 
used to imitate parameter aliasing. The obvious implementation of call by reference depends 
on the use of pointers. Pointer aliasing and parameter aliasing are, therefore, closely related. 
But at source language level they are distinct phenomena. 

will print aliased! when i = j, or distinct when i <> j. 

Following McCarthy and Painter [20] and Hoare and Wirth [13], there’s a well-known 
solution to subscript aliasing. Even though, in every practical implementation 
of a programming language, an array is a collection of separately addressable storage 
locations, it can be treated as if it were a single variable containing a mapping from 
indices to rvalues. The instruction $b[I] := E$ is considered to assign a new mapping 
$b \oplus I \mapsto E$ to $b$, and therefore $R^{b}_{b \oplus I \mapsto E}$ is the precondition that the assignment 
$b[I] := E$ will terminate in a state satisfying $R$. Array element access is resolved by 
comparing indices, so that $(b \oplus I \mapsto E)[J] = (\textbf{if } I = J \textbf{ then } E \textbf{ else } b[J] \textbf{ fi})$. 

This interpretation of array element assignment is a complete solution to the 
problem of subscript aliasing, though it must be used carefully with concurrent 
assignments in case there are aliases in the set of assigned locations. It resolves the 
problem entirely by the rvalue comparison $I = J$ in the reduction rule, even though 
aliasing is between lvalues $b[I]$ and $b[J]$. 
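The array-as-mapping view can be made concrete with a small Python sketch. It assumes arrays are modelled as dictionaries from indices to rvalues; `update` and `access_after` are hypothetical names introduced only for this illustration. The first builds the new mapping, the second applies the reduction rule, which consults nothing but the rvalue comparison of the two indices.

```python
def update(b, i, e):
    """Return the mapping that agrees with b everywhere except at i, where it yields e.

    b itself is left unchanged, mirroring the treatment of assignment as
    replacement of the whole mapping.
    """
    new_b = dict(b)
    new_b[i] = e
    return new_b


def access_after(b, i, e, j):
    """Value at j of the updated mapping, by the reduction rule: compare indices only."""
    return e if i == j else b[j]


if __name__ == "__main__":
    b = {0: 10, 1: 20, 2: 30}
    print(update(b, 1, 99)[1], access_after(b, 1, 99, 2))
```

The two functions agree at every pair of indices, which is the content of the reduction rule.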



2.2 Pointer Aliasing 

In practice a computer memory is a giant array, and memory addresses - the primitive 
mechanism underlying lvalues and pointers - are its indices. Whenever two distinct 
occurrences of pointer values are equal, we have pointer aliasing. 

Consider, for example, the problem (part of the in-place list-reversal example 
below) of moving an element from the head of one tl-linked list r to the head of a 
similar list p, using assignment to heap object components. If we write it as a 
sequence of non-concurrent assignments* we must use an auxiliary variable q and a 
sequence of at least four instructions, one of which alters a pointer in the heap. This is 
one solution: 

q := r; r := r.tl; q.tl := p; p := q 

At every intermediate stage of execution of this program there is more than one way 
to refer to particular objects in the store. After the first assignment, q and r are the 
same pointer; after the second, q.tl and r; after the third, q.tl and p; after the last, p and 
q. If the p and r lists aren’t disjoint collections of objects, still more aliasing may be 
produced. 
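The aliases created at each step can be observed directly in a language with object references. The Python sketch below assumes a hypothetical `Node` class with `hd` and `tl` fields and a non-empty list r; each `assert` records the alias that holds immediately after the corresponding assignment.

```python
class Node:
    """A heap object with a head value and a tail pointer (None marks the end)."""

    def __init__(self, hd, tl=None):
        self.hd = hd
        self.tl = tl


def to_list(p):
    """Collect the hd values along the tl-linked list starting at p."""
    out = []
    while p is not None:
        out.append(p.hd)
        p = p.tl
    return out


def move_head(r, p):
    """Move the first element of list r to the front of list p; r must be non-empty."""
    q = r
    assert q is r          # q and r are the same pointer
    r = r.tl
    assert q.tl is r       # q.tl and r
    q.tl = p
    assert q.tl is p       # q.tl and p
    p = q
    assert p is q          # p and q
    return r, p


if __name__ == "__main__":
    r, p = move_head(Node(1, Node(2, Node(3))), Node(9))
    print(to_list(r), to_list(p))
```

Only the third assignment changes the heap; the other three change stack variables alone.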

It’s tempting to treat the heap as a pointer-indexed array of component-indexed 
objects, perhaps subdividing it by object type into sub-arrays (see, for example, 
Luckham and Suzuki [19] and Leino [18]). But in practice this has proved awkward 
to deal with. First, it means there has to be a translation between object-component 
formulae in the program on the one hand and array-indexing formulae in assertions on 
the other. Second, it forces us towards ‘global reasoning’: every object component 
assignment seems to affect every assertion which has to do with the heap. By contrast 
the Floyd-Hoare treatment of assignment to a variable concentrates attention on 
assertions that involve that variable, leaving others untouched, making steps of ‘local 
reasoning’ whose restriction to particular formulae matches the locality of assignment 



* I do not consider concurrent assignment in this paper. 

to a single variable. Third, our assertions about the content of the heap usually need to 
be expressed inductively, using auxiliary definitions which can easily hide aliases. 

Even though these difficulties might in principle be overcome, it feels like the 
wrong thing to do: we don’t program with a memory array in mind but rather in terms 
of distinct variables, arrays, objects and components, and our reasoning should as far 
as possible operate at the same level as our thinking. 



3 The Treatment of Subscript Aliasing 

In [21] Morris introduced an assignment rule for assignment to components of heap 
objects which generalised Burstall’s treatment [6] of hd/tl structures. He gave a 
treatment of inductively defined data structures using the notion of paths between 
objects, and presented a proof of the Schorr-Waite graph-marking algorithm [25] 
which seems far closer to a proof of the original than other published treatments 
[9][28]. 

3.1 Morris’s Treatment 

Morris treats a language which is like Java [2] in that it has pointer-aliasing but no 
parameter-aliasing, and no whole-object assignment. Program (stack) variables like x, 
y, p, q can hold pointers to objects in the heap. Objects (in the heap) have components 
indexed by names like e, f, g, h and are referred to using dot-suffix notation, for 
example p.e.e.f. 

Because there is no whole-object assignment, aliasing is only of components of 
objects. Because there is no arithmetic on component names we know immediately, 
when e and f are distinct component names and no matter what the values of A and B, 
that A.e can’t be the same lvalue as B.f. Similarly, we know that A.e and B.e will be 
the same lvalue just when the rvalues A and B are equal.* Lvalue-aliasing of object 
components can therefore be detected by identity of component name and rvalue 
comparison of pointers. 

These insights made it possible to define a rule for object-component assignment 
which avoids mentioning the memory array altogether: 

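$$\frac{Q \to R^{A.f}_{E}}{\{Q\}\ A.f := E\ \{R\}}$$

Object component substitution $R^{A.f}_{E}$ is just like variable substitution except when 
it is dealing with object component references. Morris’s axioms appear to be: 

$$(B.g)^{A.f}_{E} \triangleq B.g \quad (f \text{ and } g \text{ distinct}) \qquad\qquad (B.f)^{A.f}_{E} \triangleq \textbf{if } A = B \textbf{ then } E \textbf{ else } B.f \textbf{ fi}$$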

Formally, this treatment deals only with single occurrences of component names in 
object component formulae - p.hd and p.tl are dealt with correctly, for example, but 

* I assume, for simplicity, that distinct types of object use distinct component names. 

p.hd.tl isn’t - so it is valid only with a restricted programming language and a 
restricted vocabulary of assertion logics. There are similar axioms for assignment to 
array elements, similarly flawed - b\i\ is dealt with, but not . * 



3.2 Calculating Object-Component Assignment Axioms 

It is possible to calculate object-component substitution axioms which work without 
restriction in Morris’s target language. Since objects can’t overlap, we can treat the 
heap as a pointer-indexed collection of objects, each of which is a name-indexed 
collection of components. An object-component reference A.f in a heap H 
corresponds to a double indexing, once of the heap and once of the object. 

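$$[\![A.f]\!]H = H[\,[\![A]\!]H\,][f]$$

Assigning a value to $A.f$ replaces the object $H[\,[\![A]\!]H\,]$ with a new mapping and therefore 
the heap becomes a new mapping as well: 

$$[\![R^{A.f}_{E}]\!]H = [\![R]\!]H', \quad \text{where } H' = H \oplus [\![A]\!]H \mapsto \big(H[\,[\![A]\!]H\,] \oplus f \mapsto [\![E]\!]H\big)$$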

When /and g are distinct component names: 

l{B.g)fq H = II B.gl H' = H'[d sI H'][g] = H'[[[ B'/ q Hjg] 

= {h@(1 a] h) ^ (h [d a1 h] © / ^ d £]] H)|d B'/ q H][g] 

= j^if d a] H = d B^E'^ q H then (H[d A] h] © / ^ d h) else H[d h] fi^g] 

= if d a] H = d B'^ q H then |lH|d^]] h| © / d Hj[g] else H^d B'^ q h][.?] fi 
= if d a ] H = d B'^ q H then H^d^]] Hj[g] else H^d B'^ q Hj[g] fi 
= if d a] H = d B'^ q H then H^d B'^ q H][g] else H^d B'^ q H][g] fi 

With identical component names: 

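$$\begin{aligned}
[\![(B.f)^{A.f}_{E}]\!]H &= [\![B.f]\!]H' = H'\big[[\![B]\!]H'\big][f] = H'\big[[\![B^{A.f}_{E}]\!]H\big][f] \\
&= \big(H \oplus [\![A]\!]H \mapsto (H[[\![A]\!]H] \oplus f \mapsto [\![E]\!]H)\big)\big[[\![B^{A.f}_{E}]\!]H\big][f] \\
&= \big(\text{if } [\![A]\!]H = [\![B^{A.f}_{E}]\!]H \text{ then } (H[[\![A]\!]H] \oplus f \mapsto [\![E]\!]H) \text{ else } H[[\![B^{A.f}_{E}]\!]H] \text{ fi}\big)[f] \\
&= \text{if } [\![A]\!]H = [\![B^{A.f}_{E}]\!]H \text{ then } (H[[\![A]\!]H] \oplus f \mapsto [\![E]\!]H)[f] \text{ else } H[[\![B^{A.f}_{E}]\!]H][f] \text{ fi} \\
&= \text{if } [\![A]\!]H = [\![B^{A.f}_{E}]\!]H \text{ then } [\![E]\!]H \text{ else } [\![(B^{A.f}_{E}).f]\!]H \text{ fi} \\
&= [\![\text{if } A = B^{A.f}_{E} \text{ then } E \text{ else } (B^{A.f}_{E}).f \text{ fi}]\!]H
\end{aligned}$$

* In examples Morris deals with multiple occurrences of component names or array names by 
sometimes introducing nested substitutions, sometimes introducing constants - replacing 
b[b[i]] = j, for example, by b[b0] = j ∧ b[i] = b0. It may be that his less than formal 
statement of the mechanism is misleading. 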

In each case the calculated equivalence leads to an axiom which is identical to 
Morris’s except that B on the right-hand side of the axiom is replaced by $B^{A.f}_{E}$. 



3.3 The Component-as-Array Trick 

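The axioms for object component substitution, following an assignment A.f := E, for 
distinct component names f and g, are 

$$(B.g)^{A.f}_{E} \triangleq (B^{A.f}_{E}).g \qquad (B.f)^{A.f}_{E} \triangleq \text{if } A = B^{A.f}_{E} \text{ then } E \text{ else } (B^{A.f}_{E}).f \text{ fi}$$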

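The standard treatment of an assignment b[I] := E gives us for distinct arrays b and c 

$$(c[J])^{b[I]}_{E} \triangleq c[J^{b[I]}_{E}] \qquad (b[J])^{b[I]}_{E} \triangleq \text{if } I = J^{b[I]}_{E} \text{ then } E \text{ else } b[J^{b[I]}_{E}] \text{ fi}$$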

It is clear from the correspondence between these treatments that object-component 
substitution is formally equivalent to a treatment of object components as pointer- 
indexed arrays. That is, assignment to component f of an object pointed to by A can be 
treated as if it were an assignment to the A-indexed component of an array f, and 
access to the f component of an object pointed to by A as selection of the Ath 
component of an array f. 
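
The correspondence can be illustrated with a small sketch. The Python below is not from the paper: the function names and the two-object heap are hypothetical, chosen only to show that assigning to A.f in a pointer-indexed heap of objects gives the same result as assigning to the A-indexed element of a component array f.

```python
# Two views of the same heap (hypothetical example data).
# Object view:             heap[pointer][component_name]
# Component-as-array view: arrays[component_name][pointer]

def to_component_arrays(heap):
    """Transpose a pointer-indexed heap into one pointer-indexed array per component."""
    arrays = {}
    for ptr, obj in heap.items():
        for name, value in obj.items():
            arrays.setdefault(name, {})[ptr] = value
    return arrays

def assign_object_view(heap, a, f, e):
    """A.f := E on the object view: object a is replaced by a new mapping."""
    new_heap = dict(heap)
    new_heap[a] = {**heap[a], f: e}
    return new_heap

def assign_array_view(arrays, a, f, e):
    """A.f := E treated as f[A] := E on the component-as-array view."""
    new_arrays = dict(arrays)
    new_arrays[f] = {**arrays[f], a: e}
    return new_arrays

heap = {1: {"hd": 10, "tl": 2}, 2: {"hd": 20, "tl": None}}
arrays = to_component_arrays(heap)

after_obj = assign_object_view(heap, 1, "tl", None)   # 1.tl := nil
after_arr = assign_array_view(arrays, 1, "tl", None)  # tl[1] := nil
```

In the array view only the tl array is replaced; the hd array is shared unchanged, which is the locality that section 3.4 describes.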

This observation is certainly not novel: Burstall gives an equivalent in [6], and it 
may be that Morris intended this reading in [21]. It is worth stating clearly, however, 
to clarify what seems otherwise to be imperfectly understood ‘folk knowledge’ . 

The advantages of the component-as-array treatment are considerable. First of all, 
it is obvious that it enables the calculation of weakest preconditions. Second, it means 
that we don’t need a new structural induction to deal with object component 
substitution: a considerable advantage when mechanising proof and proof-checking. 
Finally, and most importantly in the context of this paper, it makes possible a formal 
treatment of object component substitution into inductively defined formulae. 

I feel it’s necessary, despite its advantages, to emphasise that the treatment is 
doubly a trick. It’s a trick built on the array-as-mapping trick of McCarthy and 
Painter. It’s violently at odds with our understanding of how heaps work in practice. It 
isn’t clear how much of it would survive a relaxation of the restrictions we have 
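imposed on our programming language. 

$$\frac{Q \rightarrow R^{x}_{E}}{\{Q\}\ x := E\ \{R\}} \qquad \frac{Q \rightarrow R^{A.f}_{E}}{\{Q\}\ A.f := E\ \{R\}} \qquad \frac{\{Q\}\ S\ \{R'\} \quad R' \rightarrow R}{\{Q\}\ S\ \{R\}}$$

$$\frac{Q \rightarrow P \quad P \wedge \neg B \rightarrow R \quad P \wedge B \rightarrow t > 0 \quad \{P \wedge B\}\ S\ \{P\} \quad \{P \wedge B \wedge t = vt\}\ S\ \{t < vt\}}{\{Q\}\ \text{while } B \text{ do } S \text{ od}\ \{R\}}$$

$$\frac{\{Q \wedge B\}\ S_{then}\ \{R\} \quad \{Q \wedge \neg B\}\ S_{else}\ \{R\}}{\{Q\}\ \text{if } B \text{ then } S_{then} \text{ else } S_{else} \text{ fi}\ \{R\}} \qquad \frac{\{Q\}\ S_1\ \{Q'\} \quad \{Q'\}\ S_2\ \{R\}}{\{Q\}\ S_1; S_2\ \{R\}}$$

Fig. 1. Hoare-triple rules 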



3.4 Restricted Global Reasoning 

If we treat the heap as a global array, and convert all our object component 
assignments to heap-array element assignments, then we have global reasoning, 
because substitution of a new mapping for the heap array affects every assertion about 
any part of the heap. By contrast the Floyd-Hoare treatment of variable assignment 
gives us local reasoning: only those assertions which mention the assigned variable 
are affected by substitution. 

Object component substitution and the component-as-array trick each offer a 
restricted global reasoning. Some locality is achieved effortlessly, when formulae 
which only mention component f aren’t affected by assignment to component g. But 
on the other hand the interaction of assignment with inductive definitions, discussed 
below, needs careful treatment if a global reasoning explosion is to be avoided. 



4 Hoare-Triple Rules 

I consider a language which has assignment to variables and object components, 
while-do-od, if-then-else-fi and instruction sequence (fig. 1). The usual caveats apply 
to the while-do-od rule: vt must be a fresh variable, t must be an integer-valued 
function of the state. It is useful to include a postcondition-strengthening rule. 
Component indexing $A.(B \oplus C \mapsto E)$ is equivalent to (if A = C then E else A.B fi). 
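
As an illustration of component indexing, the sketch below models a component as a Python dict from pointers to values. The names `override` and `select` and the two-object heap are assumptions made for this example, not notation from the paper. It evaluates the substituted postcondition of p.a := p in the state before the assignment and compares it with the postcondition evaluated after actually executing the assignment.

```python
def override(b, c, e):
    """The mapping b ⊕ c ↦ e, built as a new dict (b maps pointers to values)."""
    return {**b, c: e}

def select(x, b):
    """x.b: the b component of the object pointed to by x."""
    return b[x]

# Hypothetical heap: two objects "p" and "x", with components a and b.
a = {"p": "x", "x": "x"}
b = {"p": 3, "x": 7}
p = "p"

# Precondition calculated by substitution for p.a := p and the postcondition
# p.a.a.a.b = 3: replace a by a ⊕ p ↦ p and evaluate in the *old* state.
a_sub = override(a, p, p)
wp_holds = select(select(select(select(p, a_sub), a_sub), a_sub), b) == 3

# Execute the assignment and evaluate the postcondition in the *new* state.
a_after = dict(a)
a_after[p] = p
post_holds = select(select(select(select(p, a_after), a_after), a_after), b) == 3

# After simplifying each p.(a ⊕ p ↦ p) to p, the precondition is just p.b = 3.
simplified = select(p, b) == 3
```

All three booleans agree on this heap, as the simplification rule predicts.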

I omit from the rules anything which requires definedness of formulae. Definedness 
is especially important in pointer-manipulating programs, because nil is such a 
dangerous value - often in use, of pointer type, and yet not dot-suffixable. 
Nevertheless, it would add little to the discussion in this paper to deal with 
definedness, if only because my examples don’t attempt to traverse nil pointers. 



5 Three Small Examples 

Cycles in the heap show pointer aliasing in its rawest form. Three small mechanical 
proofs, calculated in Jape [5], are shown in full detail except for trivial implications. 

First a single-step cycle established by a single assignment. 

{p.b = 3} p.a := p {p.a.a.a.b = 3} 

The proof (fig. 2) consists of a use of the assignment rule (line 5) followed by three 
applications of component-indexing simplification on lines 4, 3 and 2. 

1: p.b=3 → p.b=3                                         {→-I, hyp} 
2: p.b=3 → p.(a⊕p↦p).b=3                                 A.(B⊕A↦E)=E 1 
3: p.b=3 → p.(a⊕p↦p).(a⊕p↦p).b=3                         A.(B⊕A↦E)=E 2 
4: p.b=3 → p.(a⊕p↦p).(a⊕p↦p).(a⊕p↦p).b=3                 A.(B⊕A↦E)=E 3 
5: {p.b=3} (p.a:=p) {p.a.a.a.b=3}                        := 4 

Fig. 2. A single-step cycle 

Next, a multi-step cycle established by a sequence of two assignments. 

{p.d = 3} q.b.c := p; p.a := q {p.a.b.c.a.b.c.d = 3} 

The proof (fig. 3) consists of an application of the sequence rule (line 9), two 
applications of the assignment rule (lines 8 and 5) and various uses of component- 
indexing simplification. Note that we can’t simplify q.b.c.(a⊕p↦q) on line 6, but 
the mapping can be eliminated on line 3, once q.b.c has been simplified to p. 

1: p.d=3 → p.d=3                                                    {→-I, hyp} 
2: p.d=3 → q.b.(c⊕q.b↦p).d=3                                        A.(B⊕A↦E)=E 1 
3: p.d=3 → p.(a⊕p↦q).b.(c⊕q.b↦p).d=3                                A.(B⊕A↦E)=E 2 
4: p.d=3 → q.b.(c⊕q.b↦p).(a⊕p↦q).b.(c⊕q.b↦p).d=3                    A.(B⊕A↦E)=E 3 
5: {p.d=3} (q.b.c:=p) {q.b.c.(a⊕p↦q).b.c.d=3}                       := 4 
6: q.b.c.(a⊕p↦q).b.c.d=3 → q.b.c.(a⊕p↦q).b.c.d=3                    {→-I, hyp} 
7: q.b.c.(a⊕p↦q).b.c.d=3 → p.(a⊕p↦q).b.c.(a⊕p↦q).b.c.d=3            A.(B⊕A↦E)=E 6 
8: {q.b.c.(a⊕p↦q).b.c.d=3} (p.a:=q) {p.a.b.c.a.b.c.d=3}             := 7 
9: {p.d=3} (q.b.c:=p; p.a:=q) {p.a.b.c.a.b.c.d=3}                   sequence 5,8 

Fig. 3. A multi-step cycle (version 1) 

If we make the same cycle with the same instructions, but executed in the reverse 
order (fig. 4), neither of the mappings generated by the assignment rule on line 8 can 
be eliminated immediately, but they are dealt with eventually, on lines 4 and 2. 

1: p.d=3 → p.d=3                                                                    {→-I, hyp} 
2: p.d=3 → q.b.(c⊕q.b↦p).d=3                                                        A.(B⊕A↦E)=E 1 
3: p.d=3 → p.(a⊕p↦q).b.(c⊕q.b↦p).d=3                                                A.(B⊕A↦E)=E 2 
4: p.d=3 → q.b.(c⊕q.b↦p).(a⊕p↦q).b.(c⊕q.b↦p).d=3                                    A.(B⊕A↦E)=E 3 
5: p.d=3 → p.(a⊕p↦q).b.(c⊕q.b↦p).(a⊕p↦q).b.(c⊕q.b↦p).d=3                            A.(B⊕A↦E)=E 4 
6: {p.d=3} (p.a:=q) {p.a.b.(c⊕q.b↦p).a.b.(c⊕q.b↦p).d=3}                             := 5 
7: p.a.b.(c⊕q.b↦p).a.b.(c⊕q.b↦p).d=3 → p.a.b.(c⊕q.b↦p).a.b.(c⊕q.b↦p).d=3            {→-I, hyp} 
8: {p.a.b.(c⊕q.b↦p).a.b.(c⊕q.b↦p).d=3} (q.b.c:=p) {p.a.b.c.a.b.c.d=3}               := 7 
9: {p.d=3} (p.a:=q; q.b.c:=p) {p.a.b.c.a.b.c.d=3}                                   sequence 6,8 

Fig. 4. A multi-step cycle (version 2) 



6 Substitution and Auxiliary Definitions 

When we make proofs of programs which manipulate heap objects using pointers, our 
assertions will usually call upon auxiliary definitions of data structures. Auxiliary 
definitions are useful for all kinds of reasons: they allow us to parameterise important 
assertions in user-defined predicates, they let us define inductive calculations, they 
shorten descriptions. But the Floyd/Hoare mechanism, even in the absence of pointer 
aliasing, has difficulty with assertions which contain arbitrary auxiliary defined 
formulae (henceforth adfs). 

If, for example, we define a predicate 

$$F(z) \triangleq (x = z)$$

then F(y) both asserts that y has the same value as x and contains an implicit 
occurrence of x. Substitution deals only with explicit occurrences, so the variable 
assignment rule would seem to allow us to conclude mistakenly 

$$\{F(y)\}\ x := x + 1\ \{F(y)\}$$

The difficulty would disappear if we were to insist that adfs are expanded from 
their definitions before we use substitution. That would not merely be inconvenient 
but impossible in general, because inductive definitions can expand indefinitely. If we 
insist, however, that definitions are program-variable closed - that is, that they have 
no free occurrences of variable names which can be the target of substitution - then 
we can deal with them in unexpanded form. 

For example, we might define 

$$F(u, v) \triangleq (u = v)$$

and the assignment rule would correctly calculate 

$$\{F(x+1, y)\}\ x := x + 1\ \{F(x, y)\}$$
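
The difference between the two definitions can be made concrete. In the sketch below, which is not from the paper, program states are dicts, expressions are functions of the state, and the names `F_open` and `F_closed` are hypothetical stand-ins for the two versions of F.

```python
def assign(state, var, expr):
    """Execute var := expr, where expr is a function of the state."""
    return {**state, var: expr(state)}

def F_open(state, z):
    """F(z) = (x = z): not program-variable closed, it reads x implicitly."""
    return state["x"] == z(state)

def F_closed(state, u, v):
    """F(u, v) = (u = v): closed, it depends only on its parameters."""
    return u(state) == v(state)

x = lambda s: s["x"]
y = lambda s: s["y"]
x_plus_1 = lambda s: s["x"] + 1

# Substitution finds no explicit x in F(y), so it would leave F(y) unchanged
# and claim {F(y)} x := x+1 {F(y)}. Execution shows the claim is unsound.
before = {"x": 4, "y": 4}
after = assign(before, "x", x_plus_1)
naive_pre = F_open(before, y)
actual_post = F_open(after, y)

# With the closed definition, substituting x+1 for x in F(x, y) gives F(x+1, y),
# and precondition and postcondition agree.
before2 = {"x": 3, "y": 4}
pre = F_closed(before2, x_plus_1, y)
post = F_closed(assign(before2, "x", x_plus_1), x, y)
```

Here `naive_pre` holds while `actual_post` fails, whereas `pre` and `post` have the same truth value.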

The problem of auxiliary definitions of heap data structures is not quite so easy to 
solve. Reynolds [23] deals with assertion procedures, but his intentions and his 
approach are quite distinct from that developed below. 



6.1 Inductive Auxiliary Definitions of Data Structures 

Pointer-linked data-structure definitions are usually inductively defined. Indeed it is 
difficult to see how we could do without induction when specifying programs which 
manipulate data structures via pointers. Making them program-variable closed doesn’t 
deal with object component assignment, but the component-as-array trick allows us to 
make them component-array closed as well. 

Following Burstall, Morris and Reynolds, an example data structure which uses a 
component array parameter is the sequence of objects* $A \Rightarrow_f B$ generated by starting 
at the object pointed to by A and following links in f components until a pointer equal 
to B is encountered. 

$$A \Rightarrow_f B \triangleq \text{if } A = B \text{ then } \langle\,\rangle \text{ else } \langle A\rangle \mathbin{@} (A.f \Rightarrow_f B) \text{ fi}$$

Fig. 5. An inductive definition of f-linked sequences in the heap 

Fig. 6. The sequence $p \Rightarrow_{tl} \mathit{nil}$ 

Fig. 7. The sequence $p \Rightarrow_{tl} q$ 

* It’s helpful to think of this formula as generating a sequence of objects, but of course it 
actually generates a sequence of pointer values. 

For various reasons (not least the fact that nil doesn’t point to an object), B isn’t 
included in the sequence. The definition is both program-variable and component- 
array closed: it mentions nothing but constants and its parameters A, B and f. The 
sequence it generates isn’t necessarily a list - that is, a finite sequence with no 
repetitions - because cycles in the heap may mean that we never reach B. It will not 
be defined if we attempt to traverse a nil pointer before reaching B. 
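
A minimal executable reading of the definition, assuming that a component array is a Python dict from pointers to pointers and that `None` stands for nil. The function name `seq` and the `fuel` bound are choices made for this sketch: the bound replaces the infinite sequence that a cycle would generate with an exception.

```python
def seq(f, a, b, fuel=1000):
    """The sequence a =>_f b as a list of pointer values.

    Raises ValueError if nil is traversed before reaching b (the undefined case),
    and RuntimeError when fuel runs out (a stand-in for a non-terminating cycle).
    """
    out = []
    while a != b:
        if a is None:
            raise ValueError("traversed nil before reaching B")
        if fuel == 0:
            raise RuntimeError("B not reached: probably a cycle")
        out.append(a)   # B itself is never included
        a = f[a]
        fuel -= 1
    return out

tl = {1: 2, 2: 3, 3: None}
whole_list = seq(tl, 1, None)   # every object from 1 up to nil
fragment = seq(tl, 1, 3)        # stops before the object 3
empty = seq(tl, 2, 2)           # A = B gives the empty sequence
```

The function takes only its parameters and constants as input, mirroring the closedness property described above.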

If we consider objects with hd and tl components, as in the list reversal and list 
merge examples below, then $p \Rightarrow_{tl} \mathit{nil}$ can represent what we normally think of as 
‘the list p’ (fig. 6). If p and q point to distinct objects, then $p \Rightarrow_{tl} q$ describes a list 
fragment, the sequence of objects linked by tl components starting with the one 
pointed to by p up to but not including the one pointed to by q (fig. 7). In each of 
these examples I have assumed that the data structure is acyclic. 

Fig. 8. Effect of the assignment p.tl := q on the sequence $p \Rightarrow_{tl} q$ 



7 Spatial Separation, Global Reasoning and Inductive Definitions 

What imperative graph, tree and list algorithms all seem to do is to work with disjoint 
bits of data structures, nibbling away at the edges, moving a node from here to there, 
swinging pointers, altering values in exposed components. The locality of assignment 
means that changes made in one area of the heap don’t affect objects elsewhere. But 
because pointer aliasing is always a possibility, our reasoning must continually take 
into account the possibility that the location we are altering is referred to under 
another name. In practice we take that possibility into account most often in order to 
dismiss it. Our logic must make this easy to do, and we must choose our assertions to 
exploit this capability. 

Substitution into $x \Rightarrow_f y$, as a result of the assignment A.f := E, generates 
$x \Rightarrow_{f \oplus A \mapsto E} y$. That expands to give a formula which contains indefinitely many 
occurrences of the mapping $f \oplus A \mapsto E$, in $x.(f \oplus A \mapsto E)$, 
$x.(f \oplus A \mapsto E).(f \oplus A \mapsto E)$, and so on. This explosion of effects, produced by an 
assignment which affects only a single location, must either be avoided or effectively 
dealt with. It arises with any inductive auxiliary definition which has a component- 
array parameter. 

By contrast, assignment is operationally local. Consider, for example the $p \Rightarrow_{tl} q$ 
data structure of fig. 7. Executing p.tl := q will swing the pointer in the first box to 
point at the last, changing the data structure to that in fig. 8. 

Only one component of the heap changes because of the assignment. The challenge 
is to imitate this simplicity in our reasoning, to avoid an explosion of mappings and 
combat what Hoare and Jifeng [12] call ‘the complexity of pointer-swing’. 

In this particular case the result of substitution is easy to unravel: 

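$$\begin{aligned}
(p \Rightarrow_{tl} q)^{p.tl}_{q} &= p \Rightarrow_{tl \oplus p \mapsto q} q \\
&= \text{if } p = q \text{ then } \langle\,\rangle \text{ else } \langle p\rangle \mathbin{@} \big(p.(tl \oplus p \mapsto q) \Rightarrow_{tl \oplus p \mapsto q} q\big) \text{ fi} \\
&= \text{if } p = q \text{ then } \langle\,\rangle \text{ else } \langle p\rangle \mathbin{@} \big(q \Rightarrow_{tl \oplus p \mapsto q} q\big) \text{ fi} \\
&= \text{if } p = q \text{ then } \langle\,\rangle \text{ else } \langle p\rangle \mathbin{@} \langle\,\rangle \text{ fi} \\
&= \text{if } p = q \text{ then } \langle\,\rangle \text{ else } \langle p\rangle \text{ fi}
\end{aligned}$$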

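Not every example is so straightforward. Consider the effect of the assignment 
x.tl := y on the postcondition $x \neq \mathit{nil} \wedge \mathit{list}(x \Rightarrow_{tl} \mathit{nil})$, where the list predicate* 
asserts that a sequence is finite and non-repetitive. We may proceed with substitution, 
expansion and simplification as before: 

$$\begin{aligned}
\big(x \neq \mathit{nil} \wedge \mathit{list}(x \Rightarrow_{tl} \mathit{nil})\big)^{x.tl}_{y} &= x \neq \mathit{nil} \wedge \mathit{list}(x \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}) \\
&= x \neq \mathit{nil} \wedge \mathit{list}\big(\text{if } x = \mathit{nil} \text{ then } \langle\,\rangle \text{ else } \langle x\rangle \mathbin{@} (x.(tl \oplus x \mapsto y) \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}) \text{ fi}\big) \\
&= x \neq \mathit{nil} \wedge \mathit{list}\big(\text{if } x = \mathit{nil} \text{ then } \langle\,\rangle \text{ else } \langle x\rangle \mathbin{@} (y \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}) \text{ fi}\big) \\
&= x \neq \mathit{nil} \wedge \mathit{list}\big(\langle x\rangle \mathbin{@} y \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}\big)
\end{aligned}$$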

Without further information we can go no further, but we can be sure that this is the 
weakest precondition, because all we’ve done is to use component-as-array 
substitution, expand a definition and evaluate a conditional. 

We’ve by no means achieved local reasoning yet, because there is a mapping 
$tl \oplus x \mapsto y$ in the argument to an inductively defined auxiliary formula. Observe, 
however, that when x doesn’t point to any of the components of $y \Rightarrow_{tl} \mathit{nil}$, 
assignment to x.tl can’t affect the meaning of that formula. In those circumstances, 
therefore, $y \Rightarrow_{tl \oplus x \mapsto E} \mathit{nil} = y \Rightarrow_{tl} \mathit{nil}$ for any formula E. (This is easily shown 
formally by induction on the length of finite sequences.) We might, that is, be content 
to assume separation of objects in the heap, and to prove the simpler precondition 

$$x \neq \mathit{nil} \wedge x \notin y \Rightarrow_{tl} \mathit{nil} \wedge \mathit{list}\big(\langle x\rangle \mathbin{@} y \Rightarrow_{tl} \mathit{nil}\big)$$

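Note that the assumption ought to hold, because if x does point into $y \Rightarrow_{tl} \mathit{nil}$ - 
that is, if $y \Rightarrow_{tl} \mathit{nil} = y \Rightarrow_{tl} x \mathbin{@} x \Rightarrow_{tl} \mathit{nil}$ - then we don’t have a non-repeating 
sequence and the precondition is false: 

$$\begin{aligned}
& x \neq \mathit{nil} \wedge \mathit{list}\big(\langle x\rangle \mathbin{@} y \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}\big) \\
&= x \neq \mathit{nil} \wedge \mathit{list}\big(\langle x\rangle \mathbin{@} y \Rightarrow_{tl \oplus x \mapsto y} x \mathbin{@} x \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}\big) \\
&= x \neq \mathit{nil} \wedge \mathit{list}\big(\langle x\rangle \mathbin{@} y \Rightarrow_{tl} x \mathbin{@} \langle x\rangle \mathbin{@} y \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}\big)
\end{aligned}$$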

What’s being appealed to by adding $x \notin y \Rightarrow_{tl} \mathit{nil}$ to the precondition, thereby 
eliminating a mapping, is the principle of spatial separation. There is a set of objects 
whose components help define the meaning of $y \Rightarrow_{tl} \mathit{nil}$, and if the object x points to 
isn’t one of that set, then the assignment x.tl := E can’t have any effect on $y \Rightarrow_{tl} \mathit{nil}$. 

In practice spatial-separation assertions like $x \notin y \Rightarrow_{tl} \mathit{nil}$ aren’t plucked out of the 
air - they are found in invariants, as shown in the list reversal, list merge and graph 
marking examples below. 
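
The separation argument can be checked on a concrete heap. The sketch below is an illustration only: `seq` and `override` are hypothetical helper names, `None` stands for nil, and the heap is assumed acyclic so that `seq` terminates.

```python
def seq(f, a, b):
    """a =>_f b on an acyclic heap: pointers from a up to but not including b."""
    out = []
    while a != b:
        out.append(a)
        a = f[a]
    return out

def override(f, a, e):
    """The mapping f ⊕ a ↦ e as a new dict."""
    return {**f, a: e}

# Two separate tl-linked lists: 1 -> 2 -> nil and 3 -> 4 -> nil.
tl = {1: 2, 2: None, 3: 4, 4: None}
x, y = 1, 3

# x is not in y =>_tl nil, so the mapping tl ⊕ x ↦ y can be dropped there.
separated = x not in seq(tl, y, None)
unchanged = seq(override(tl, x, y), y, None) == seq(tl, y, None)

# Precondition of x.tl := y, before and after eliminating the mapping.
pre_with_mapping = seq(override(tl, x, y), x, None)
pre_simplified = [x] + seq(tl, y, None)

# The earlier example: p.tl := q collapses p =>_tl q to the singleton <p>.
p, q = 1, None
collapsed = seq(override(tl, p, q), p, q)
```

If x did point into the y list, `override(tl, x, y)` would create a cycle and the sequence would repeat, which is the case the list predicate rules out.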



* Defined by heap-independent axioms: see section 9.1. 

[Figure: regions labelled ‘affected by assignment to tl component’, ‘affected by assignment to hd component’ and ‘the sequence $p \Rightarrow_{tl} q$’] 

Fig. 9. Offset sequence with different support sets for different components 



What’s surprising is that lists and sequences aren’t a special case. Spatial 
separation is exactly what we need to deal in general with inductively defined heap 
data structures. Any data structure in a finite heap depends on the component values 
of a finite set of objects. If we assign to a component of an object outside that set, the 
data structure can’t be affected. Faced with an adf whose arguments include a 
mapping $B \oplus A \mapsto E$, therefore, we can either expand it until all the affected objects 
are exposed - as, for example, we expanded $x \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}$ above - or we can show 
that because of spatial separation the mapping is never used, no matter how far we 
expand the formula - as in the case of $y \Rightarrow_{tl \oplus x \mapsto y} \mathit{nil}$ above - or we can do both 
steps, one after the other - or we are stuck for the moment, and we have to leave the 
mapping alone. 



7.1 Spatial Separation Examples 

Spatial separation of a data structure and an object A can always be expressed as 
$A \notin S$, where S is a set of objects whose component values define the data structure. 
In straightforward cases, like the $\Rightarrow_f$ definition of fig. 5, the set of objects is just the 
data structure itself. 

Offset Sequences. Consider the definition 

$$A \Rightarrow_{f,g} B \triangleq \text{if } A = B \text{ then } \langle\,\rangle \text{ else } \langle A.f.f\rangle \mathbin{@} (A.g \Rightarrow_{f,g} B) \text{ fi}$$

The formula $p \Rightarrow_{hd,tl} q$ describes a sequence of objects not necessarily linked 
together, and offset by two steps from $p \Rightarrow_{tl} q$ (fig. 9). Assignment to A.hd or A.tl 
may alter the sequence, but the sets of objects which support the description differ 
between cases, and neither of them is the sequence itself. 

Ordered Sequences. The predicate $\mathit{olist}_f$ asserts that a particular sequence of objects 
is ordered by (≤) in its f component: 

$$\mathit{olist}_f\langle\,\rangle \qquad \mathit{olist}_f\langle A\rangle$$

$$\mathit{olist}_f(R \mathbin{@} \langle A, B\rangle \mathbin{@} S) \triangleq \mathit{olist}_f(R \mathbin{@} \langle A\rangle) \wedge A.f \leq B.f \wedge \mathit{olist}_f(\langle B\rangle \mathbin{@} S)$$

The objects of a sequence S which $\mathit{olist}_f(S)$ asserts is ordered don’t have to be 
linked together in the heap. An assignment A.f := E doesn’t affect the assertion just 
when $A \notin S$. 

If we combine definitions then we have to be careful. The assertion 
$\mathit{olist}_{hd}(x \Rightarrow_{tl} \mathit{nil})$ uses both the $\mathit{olist}_f$ and the $\Rightarrow_f$ definitions to state that a 
particular linked sequence in the heap is ordered: the assertion is unaffected by 
A.tl := E when $A \notin x \Rightarrow_{tl} \mathit{nil}$ - because that is the condition which makes sure that 
the sequence $x \Rightarrow_{tl} \mathit{nil}$ remains the same - and unaffected by A.hd := E in exactly 
the same circumstances, even though that assignment doesn’t alter the object- 
sequence itself. 

Cyclic Graphs. The set of nodes in a directed binary graph reachable through a 
pointer A is 

$$A{*}_{l,r} \triangleq \text{if } A = \mathit{nil} \text{ then } \{\,\} \text{ else } \{A\} \cup A.l{*}_{l,r} \cup A.r{*}_{l,r} \text{ fi}$$

The assignment p.l := q, given the postcondition $p \neq \mathit{nil} \wedge G(p{*}_{l,r})$, where G is 
some predicate, will generate the precondition 

$$p \neq \mathit{nil} \wedge G\big(\{p\} \cup q{*}_{l \oplus p \mapsto q,\,r} \cup p.r{*}_{l \oplus p \mapsto q,\,r}\big)$$

We can expect our graphs to be cyclic, so the occurrences of the mapping $l \oplus p \mapsto q$ 
in this formula will give us problems. 

In this case it is helpful to rewrite the definition to make spatial separation easier to 
establish. It’s simple to calculate reachable nodes using a directed acyclic graph 
(DAG), breaking cycles by including an exclusion set S in the definition: 

$$A{*}_{l,r,S} \triangleq \text{if } A = \mathit{nil} \vee A \in S \text{ then } \{\,\} \text{ else } \{A\} \cup A.l{*}_{l,r,S \cup \{A\}} \cup A.r{*}_{l,r,S \cup \{A\}} \text{ fi}$$

This automatically gives spatial separation between the root node A and the 
subgraphs generated from the child nodes A.l and A.r. Now the assignment p.l := q, 
given the postcondition $p \neq \mathit{nil} \wedge p \notin S \wedge G(p{*}_{l,r,S})$, will generate 

$$p \neq \mathit{nil} \wedge p \notin S \wedge G\big(\{p\} \cup q{*}_{l \oplus p \mapsto q,\,r,\,S \cup \{p\}} \cup p.r{*}_{l \oplus p \mapsto q,\,r,\,S \cup \{p\}}\big)$$

When $A \in S$ the graph $B{*}_{l,r,S}$ won’t include A - that is, we have spatial separation. 
Formally we have the equivalence (provable by induction on the height of finite 
DAGs) 

$$A \in S \rightarrow \big(B{*}_{l \oplus A \mapsto E,\,r,\,S} = B{*}_{l,r,S}\big)$$

Since p belongs to $S \cup \{p\}$, the precondition can be simplified to 

$$p \neq \mathit{nil} \wedge p \notin S \wedge G\big(\{p\} \cup q{*}_{l,r,S \cup \{p\}} \cup p.r{*}_{l,r,S \cup \{p\}}\big)$$
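
The exclusion-set definition translates directly into a terminating recursive function. The sketch below is an illustration under assumptions: `reach` and `override` are hypothetical names, `l` and `r` are dicts from pointers to pointers, `None` is nil, and the three-node cyclic graph is invented for the example.

```python
def reach(l, r, a, s=frozenset()):
    """Nodes reachable from a through l and r, never entering the exclusion set s."""
    if a is None or a in s:
        return set()
    s1 = s | {a}   # the root is excluded from both subgraphs
    return {a} | reach(l, r, l[a], s1) | reach(l, r, r[a], s1)

def override(f, a, e):
    """The mapping f ⊕ a ↦ e as a new dict."""
    return {**f, a: e}

# A cyclic graph: 1.l = 2, 2.l = 1, 1.r = 3, 3.r = 1.
l = {1: 2, 2: 1, 3: None}
r = {1: 3, 2: None, 3: 1}

everything = reach(l, r, 1)

# Separation: with node 1 in the exclusion set, overriding 1.l changes nothing.
with_mapping = reach(override(l, 1, 3), r, 2, frozenset({1}))
without_mapping = reach(l, r, 2, frozenset({1}))

# Precondition of p.l := q for p = 1, q = 3: the set computed with the mapping
# equals the simplified form in which the mapping has been eliminated.
p, q = 1, 3
via_substitution = reach(override(l, p, q), r, p)
simplified = {p} | reach(l, r, q, frozenset({p})) | reach(l, r, r[p], frozenset({p}))
```

The recursion terminates on a finite heap because the exclusion set grows on every call.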



8 Heap Reasoning and Structural Reasoning 

We can make assertions about a value represented by a heap data structure which are 
about the value itself and which don’t involve knowledge of the heap. If it’s a 
sequence we may remark about its length, its non-cyclic nature, the non-occurrence of 
0 elements in its hd components, and so on. If it’s a tree we may want to talk about its 
height, the balance between its subtrees, the ordering of its tips, and so on. If it’s a 
graph we may remark about its spanning tree, its connectedness, its colouring, and so 
on. If we are careful, ‘pure’ structural reasoning can carry much of the weight of a 
specification and proof. On the other hand, spatial separation concerns require our 
heap data structure definitions to be written to expose information about the objects 
that support particular data structures. 

It seems, therefore, that specifications ought to be written at three separate levels: 

1. Remarks about the contents of particular variables and objects. 

2. Inductive definitions of data structures, including rules which interpret non- 
hiding assertions. 

3. The specification of the problem, in terms of (1) and (2). 

We have to make our definitions carefully, both to fit our problem and to permit 
the easiest and most local forms of reasoning. The definition of fig. 5 is good for 
acyclic sequences, and makes it easy to reason about assignments near the head of a 
sequence. It’s less good (see the list merge example below) when we assign to the tail 
of a sequence. It’s not very good at all if the sequence can be cyclic, and in that case it 
seems reasonable to devise a special adf which deals easily with cycles. 

It is interesting that remarks about individual heap cells - ‘pictures of memory’ - 
seem relatively unimportant in this treatment, in contrast to Reynolds [24]. On the 
other hand, spatial separation of data structures, a clear echo of Burstall’s DNRLS [6] 
and Reynolds’ spatial conjunction, is an essential part of every proof. 



9 A Worked Example: In-Place List-Reversal 

The in-place list-reversal algorithm (fig. 10) is the lowest hurdle that a pointer- 
aliasing formalism ought to be able to jump. Burstall deals with it in [6] with 
essentially the same specification as that below. 

{Q} r := p; p := nil; 
{P} while r ≠ nil do q := r; r := r.tl; q.tl := p; p := q od {R} 

Fig. 10. The in-place list-reversal algorithm 

9.1 Definitions 

I have not written inductive definitions of functions for the most part. Instead I give 
useful facts as axioms. 

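In the invariant I need append (@) and rev, and I need to be able to say that a 
particular cell-sequence S is a list: that is, it is finite length and has no repetitions. I 
can only reverse finite sequences. I need disjointness ($\not\cap$) of sequences. 

$$\begin{aligned}
& \langle\,\rangle \mathbin{@} S = S \qquad S \mathbin{@} \langle\,\rangle = S \qquad \mathit{rev}\langle\,\rangle = \langle\,\rangle \qquad \mathit{rev}\langle A\rangle = \langle A\rangle \\
& S \mathbin{@} (T \mathbin{@} U) = (S \mathbin{@} T) \mathbin{@} U \qquad \mathit{list}\,A \wedge \mathit{list}\,B \rightarrow \mathit{rev}(A \mathbin{@} B) = \mathit{rev}\,B \mathbin{@} \mathit{rev}\,A \\
& \langle\,\rangle \not\cap S \qquad \langle A\rangle \not\cap \langle B\rangle = (A \neq B) \qquad \mathit{list}\langle\,\rangle \qquad \mathit{list}\langle A\rangle \\
& S \not\cap T = T \not\cap S \qquad \mathit{list}(S \mathbin{@} T) = \mathit{list}\,S \wedge \mathit{list}\,T \wedge S \not\cap T \\
& (S \mathbin{@} T) \not\cap U = (S \not\cap U) \wedge (T \not\cap U)
\end{aligned}$$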

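For proof of termination of any list algorithm, it’s necessary to do arithmetic on 
lengths of lists. I haven’t attempted a mechanical treatment: instead I’ve appealed to 
some obvious facts. 

$$\begin{aligned}
& \mathit{list}\,S \rightarrow \mathit{length}(\langle A\rangle \mathbin{@} S) > \mathit{length}\,S \qquad \mathit{length}(\langle A\rangle \mathbin{@} S) > 0 \\
& \mathit{list}\,S \rightarrow \exists n: (n \geq 0 \wedge \mathit{length}\,S = n)
\end{aligned}$$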

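I use the $\Rightarrow_f$ definition (fig. 5) to describe sequences in the heap. It’s possible to 
prove by induction on the length of finite sequences that 

$$\big(\mathit{list}(B \Rightarrow_{tl} C) \wedge \langle A\rangle \not\cap B \Rightarrow_{tl} C\big) \rightarrow \big(B \Rightarrow_{tl \oplus A \mapsto E} C = B \Rightarrow_{tl} C\big)$$

- a fact that is appealed to several times in the proof. 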

9.2 Specification 

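The algorithm is given in variable p a pointer to a tl-linked list S. 

$$Q \triangleq \mathit{list}(p \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} = S$$

The invariant of the loop is that there is a tl-linked list from p to nil, a similar but 
distinct list from r to nil, and the reverse of the r list, followed by the p list, is the 
reverse of the original input. 

$$P \triangleq \mathit{list}(p \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{list}(r \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} \not\cap r \Rightarrow_{tl} \mathit{nil} \wedge \mathit{rev}(r \Rightarrow_{tl} \mathit{nil}) \mathbin{@} p \Rightarrow_{tl} \mathit{nil} = \mathit{rev}\,S$$

The loop measure is the length of the r list. 

$$t \triangleq \mathit{length}(r \Rightarrow_{tl} \mathit{nil})$$

On termination the p list is the reverse of the original input. 

$$R \triangleq p \Rightarrow_{tl} \mathit{nil} = \mathit{rev}\,S$$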
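
The specification can be exercised as run-time checks. The following sketch transcribes the algorithm of fig. 10 into Python and asserts Q, P, the decrease of t, and R while it runs. It is a test harness under assumptions, not a proof: `tl` is a dict, `None` is nil, the helper names are hypothetical, and the input is assumed acyclic so that `seq` terminates.

```python
def seq(tl, a, b=None):
    """a =>_tl b on an acyclic heap, as a list of pointers."""
    out = []
    while a != b:
        out.append(a)
        a = tl[a]
    return out

def is_list(s):
    """list(s): the finite sequence s has no repetitions."""
    return len(s) == len(set(s))

def reverse_in_place(tl, p):
    """In-place list reversal; mutates tl and returns the new head pointer."""
    S = seq(tl, p)                       # ghost value: the original sequence
    assert is_list(S)                    # precondition Q

    def invariant():                     # P, evaluated in the current state
        ps, rs = seq(tl, p), seq(tl, r)
        return (is_list(ps) and is_list(rs) and not set(ps) & set(rs)
                and rs[::-1] + ps == S[::-1])

    r, p = p, None
    while r is not None:
        assert invariant()
        vt = len(seq(tl, r))             # measure t before the loop body
        q = r
        r = tl[r]
        tl[q] = p
        p = q
        assert len(seq(tl, r)) < vt      # loop body reduces the measure
    assert invariant()
    assert seq(tl, p) == S[::-1]         # postcondition R
    return p

tl = {1: 2, 2: 3, 3: None}
head = reverse_in_place(tl, 1)
```

Running the checks on sample heaps only tests the specification; the semi-formal proofs establish it for every heap satisfying Q.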

This specification doesn’t say that on termination the hd components of S are what 
they originally were, though it is obvious from the program that this is so. It would be 
easy to change the specification to make this point by adding hd = HD to 
precondition, invariant and postcondition, but it would be just a drop in the ocean of 
the frame problem. 



9.3 Semi-Formal Proofs 

These proofs are written backwards: postcondition first, working towards 
precondition and several substitution steps are compressed into one. The machine- 
checked proofs available from the web site [4], calculated with the aid of Jape [5], are 
less compressed. 

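Initialisation. 

r := p; p := nil 
[substitution] 

$$\{\ \mathit{list}(p \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{list}(\mathit{nil} \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} \not\cap \mathit{nil} \Rightarrow_{tl} \mathit{nil} \wedge \mathit{rev}(p \Rightarrow_{tl} \mathit{nil}) \mathbin{@} \mathit{nil} \Rightarrow_{tl} \mathit{nil} = \mathit{rev}\,S\ \}$$

[replace $\mathit{nil} \Rightarrow_{tl} \mathit{nil}$ by $\langle\,\rangle$; then straightforward sequence calculation] 

$$\{\ \mathit{list}(p \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} = S\ \}$$

Loop Body Preserves Invariant 

q := r; r := r.tl; q.tl := p; p := q 
[substitution] 

$$\{\ \mathit{list}(r.tl \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil}) \wedge \mathit{list}(r \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil}) \wedge r.tl \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil} \not\cap r \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil} \wedge \mathit{rev}(r.tl \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil}) \mathbin{@} r \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil} = \mathit{rev}\,S\ \}$$

[Use $r \neq \mathit{nil}$ and $\mathit{list}(r \Rightarrow_{tl} \mathit{nil})$ from invariant to obtain $\mathit{list}(\langle r\rangle \mathbin{@} r.tl \Rightarrow_{tl} \mathit{nil})$; 
thence $\mathit{list}(r.tl \Rightarrow_{tl} \mathit{nil})$ and $\langle r\rangle \not\cap r.tl \Rightarrow_{tl} \mathit{nil}$, and therefore 
$r.tl \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil} = r.tl \Rightarrow_{tl} \mathit{nil}$. Use $r \neq \mathit{nil}$ and $p \Rightarrow_{tl} \mathit{nil} \not\cap r \Rightarrow_{tl} \mathit{nil}$ to obtain 
$p \Rightarrow_{tl} \mathit{nil} \not\cap (\langle r\rangle \mathbin{@} r.tl \Rightarrow_{tl} \mathit{nil})$; thence $\langle r\rangle \not\cap p \Rightarrow_{tl} \mathit{nil}$ and since $\mathit{list}(p \Rightarrow_{tl} \mathit{nil})$, 
$r \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil} = \langle r\rangle \mathbin{@} p \Rightarrow_{tl} \mathit{nil}$. Once all the assignment-induced mappings are 
eliminated, it’s a straightforward sequence calculation.] 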

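$$\{\ \mathit{list}(p \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{list}(r \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} \not\cap r \Rightarrow_{tl} \mathit{nil} \wedge \mathit{rev}(r \Rightarrow_{tl} \mathit{nil}) \mathbin{@} p \Rightarrow_{tl} \mathit{nil} = \mathit{rev}\,S \wedge r \neq \mathit{nil}\ \}$$

Loop Body Reduces Measure 

$$\{\ \mathit{length}(r \Rightarrow_{tl} \mathit{nil}) < vt\ \}$$

q := r; r := r.tl; q.tl := p; p := q 
[substitution] 

$$\{\ \mathit{length}(r.tl \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil}) < vt\ \}$$

[use $r \neq \mathit{nil}$ and $\mathit{list}(r \Rightarrow_{tl} \mathit{nil})$ from precondition to obtain $\langle r\rangle \not\cap r.tl \Rightarrow_{tl} \mathit{nil}$; 
thence $r.tl \Rightarrow_{tl \oplus r \mapsto p} \mathit{nil} = r.tl \Rightarrow_{tl} \mathit{nil}$; then straightforward sequence calculation] 

$$\{\ \mathit{list}(p \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{list}(r \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} \not\cap r \Rightarrow_{tl} \mathit{nil} \wedge \mathit{rev}(r \Rightarrow_{tl} \mathit{nil}) \mathbin{@} p \Rightarrow_{tl} \mathit{nil} = \mathit{rev}\,S \wedge r \neq \mathit{nil} \wedge \mathit{length}(r \Rightarrow_{tl} \mathit{nil}) = vt\ \}$$

The While Loop 

{P} while r ≠ nil do q := r; r := r.tl; q.tl := p; p := q od {R} 

Not shown: it appeals to the invariant and measure proofs above, and is otherwise 
straightforward sequence calculation. 

The Whole Algorithm 

{Q} r := p; p := nil; while r ≠ nil do q := r; r := r.tl; q.tl := p; p := q od {R} 

Trivial, given the previous proofs. 

{Q} 
if q = nil ∨∨ (p ≠ nil ∧∧ p.hd ≤ q.hd) then r := p; p := p.tl else r := q; q := q.tl fi; s := r; 
{P} 
while p ≠ nil ∨ q ≠ nil do 
    if q = nil ∨∨ (p ≠ nil ∧∧ p.hd ≤ q.hd) then s.tl := p; p := p.tl else s.tl := q; q := q.tl fi; 
    s := s.tl 
od 

Fig. 11. The in-place list merge algorithm 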



10 An Illustration: In-Place List Merge 

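The proof of in-place list reversal above is no advance on Burstall’s version [6]. In- 
place list merge (fig. 11) is a more challenging problem, because it works at the tail 
end of a list. The tests in the program are designed to avoid traversal of nil pointers: 
A ∨∨ B is equivalent to (if A then true else B fi); A ∧∧ B is equivalent to 
(if A then B else false fi). 

The program is given two disjoint ordered tl-linked nil-terminated lists via pointers 
p and q, at least one of which is not empty. 

$$Q \triangleq p \Rightarrow_{tl} \mathit{nil} = SP \wedge q \Rightarrow_{tl} \mathit{nil} = SQ \wedge \mathit{olist}_{hd}(p \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{olist}_{hd}(q \Rightarrow_{tl} \mathit{nil}) \wedge p \Rightarrow_{tl} \mathit{nil} \not\cap q \Rightarrow_{tl} \mathit{nil} \wedge (p \neq \mathit{nil} \vee q \neq \mathit{nil})$$

Writing $A \Rightarrow^{+}_{f} B$ for $A \Rightarrow_f B \mathbin{@} \langle B\rangle$, the invariant states that we can put the list 
fragment $r \Rightarrow^{+}_{tl} s$ before either the p list or the q list to produce an ordered result; that 
fragment joined with the two lists is a permutation of the original input; and the p and 
q lists remain disjoint. In order to prove that the result on termination is what we 
wish, it also has to assert the non-occurrence of nil in the fragment $r \Rightarrow^{+}_{tl} s$ and a 
linkage between the s object and the one or other of the lists. 

$$\begin{aligned}
P \triangleq{} & \mathit{olist}_{hd}(r \Rightarrow^{+}_{tl} s \mathbin{@} p \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{olist}_{hd}(r \Rightarrow^{+}_{tl} s \mathbin{@} q \Rightarrow_{tl} \mathit{nil}) \wedge {} \\
& \mathit{perm}(r \Rightarrow^{+}_{tl} s \mathbin{@} p \Rightarrow_{tl} \mathit{nil} \mathbin{@} q \Rightarrow_{tl} \mathit{nil},\ SP \mathbin{@} SQ) \wedge {} \\
& p \Rightarrow_{tl} \mathit{nil} \not\cap q \Rightarrow_{tl} \mathit{nil} \wedge \mathit{nil} \notin r \Rightarrow^{+}_{tl} s \wedge (s.tl = p \vee s.tl = q)
\end{aligned}$$

The measure is the sum of the lengths of the p and q lists. 

$$t \triangleq \mathit{length}(p \Rightarrow_{tl} \mathit{nil} \mathbin{@} q \Rightarrow_{tl} \mathit{nil})$$

On termination r points to an ordered tl-linked nil-terminated list which is a 
permutation of the original input. 

$$R \triangleq \mathit{olist}_{hd}(r \Rightarrow_{tl} \mathit{nil}) \wedge \mathit{perm}(r \Rightarrow_{tl} \mathit{nil},\ SP \mathbin{@} SQ)$$

Problems arise during the proof in showing and exploiting spatial separation 
between the input lists $p \Rightarrow_{tl} \mathit{nil}$ and $q \Rightarrow_{tl} \mathit{nil}$ on the one hand, and the list fragment 
$r \Rightarrow^{+}_{tl} s$ on the other. For example, moving the invariant backward through the 
sequence s.tl := p; p := p.tl; s := s.tl - the course of the loop, given that the guard in 
the if-then-else-fi is true - produces several instances of the fearsome formula 
$r \Rightarrow^{+}_{tl \oplus s \mapsto p} p$. Given 

$$\mathit{list}(r \Rightarrow^{+}_{tl} s), \quad p \neq \mathit{nil}, \quad p \notin r \Rightarrow^{+}_{tl} s$$

- each of which is implied by the invariant plus the if guard - then a bit of finite 
expansion, plus the fact that 

$$A \Rightarrow_{f \oplus B \mapsto E} B = A \Rightarrow_f B$$

makes it possible to show that 

$$r \Rightarrow^{+}_{tl \oplus s \mapsto p} p = r \Rightarrow^{+}_{tl} s \mathbin{@} \langle p\rangle$$

The rest of the proof is straightforward manipulation. 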
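
For readers who want to run the algorithm, here is a minimal Python transcription of fig. 11. It rests on several assumptions: `hd` and `tl` are dicts, `None` is nil, the comparison is taken to be non-strict, and Python's short-circuit `or` and `and` play the role of the conditional connectives ∨∨ and ∧∧. The function and helper names are hypothetical.

```python
def merge_in_place(hd, tl, p, q):
    """Merge two disjoint ordered nil-terminated lists; mutates tl, returns head r."""
    assert p is not None or q is not None    # at least one list is non-empty

    def take_p():
        # q = nil ∨∨ (p ≠ nil ∧∧ p.hd ≤ q.hd): never dereferences a nil pointer
        return q is None or (p is not None and hd[p] <= hd[q])

    if take_p():
        r, p = p, tl[p]
    else:
        r, q = q, tl[q]
    s = r
    while p is not None or q is not None:
        if take_p():
            tl[s] = p
            p = tl[p]
        else:
            tl[s] = q
            q = tl[q]
        s = tl[s]
    return r

def values(hd, tl, a):
    """hd components along the tl-linked list starting at a."""
    out = []
    while a is not None:
        out.append(hd[a])
        a = tl[a]
    return out

hd = {1: 1, 2: 4, 3: 6, 4: 2, 5: 3, 6: 7}
tl = {1: 2, 2: 3, 3: None, 4: 5, 5: 6, 6: None}
r = merge_in_place(hd, tl, 1, 4)     # lists 1,4,6 and 2,3,7
merged = values(hd, tl, r)
```

Each iteration assigns to s.tl, the tail end of the fragment built so far, which is why the proof needs separation between that fragment and the two input lists.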



11 An Illustration: Graph Marking 

The Schorr-Waite algorithm is the first mountain that any formalism for pointer 
aliasing should climb. It’s dealt with semi-formally by Morris [21] and mechanically 
by Suzuki [27]; Kowaltowski [17] gives a wonderful informal proof in pictures of a 




122 



Richard Bornat 



{Q} 
t := root; p := nil; 
while p ≠ nil ∨ (t ≠ nil ∧∧ ¬t.m) do 
    if t = nil ∨∨ t.m then 
        if p.c then 
            q := t; t := p; p := p.r; t.r := q    /* POP */ 
        else 
            q := t; t := p.r; p.r := p.l; p.l := q; p.c := true    /* SWING */ 
        fi 
    else 
        q := p; p := t; t := t.l; p.l := q; p.m := true; p.c := false    /* PUSH */ 
    fi 
od 

Fig. 12. The Schorr-Waite graph-marking algorithm 

tree-marking variant. Morris’s version of the algorithm, modified to permit nil 
pointers, is shown in fig. 12. 
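
A runnable sketch of the algorithm may help when following the specification. The Python below transcribes the POP, SWING and PUSH arms, assuming that the four components l, r, m and c are dicts indexed by node, that `None` is nil, and that the loop guard is the usual one for this algorithm (continue while the stack is non-empty or the tip is an unmarked node). The four-node cyclic graph is invented for the example.

```python
def schorr_waite(l, r, m, c, root):
    """Mark every node reachable from root, storing the stack in the l and r links."""
    t, p = root, None
    while p is not None or (t is not None and not m[t]):
        if t is None or m[t]:
            if c[p]:
                # POP: right subgraph finished, restore p.r and retreat
                q = t; t = p; p = r[p]; r[t] = q
            else:
                # SWING: left subgraph finished, move the stack link from l to r
                q = t; t = r[p]; r[p] = l[p]; l[p] = q; c[p] = True
        else:
            # PUSH: mark the tip and descend left, storing the stack link in l
            q = p; p = t; t = l[t]; l[p] = q; m[p] = True; c[p] = False

# A cyclic graph on nodes 1..3; node 4 is not reachable from the root.
l = {1: 2, 2: 1, 3: None, 4: None}
r = {1: 3, 2: 3, 3: 1, 4: None}
m = {n: False for n in l}
c = {n: False for n in l}
l0, r0 = dict(l), dict(r)     # iL and iR: the original link mappings

schorr_waite(l, r, m, c, 1)
marked = {n for n in m if m[n]}
```

After the call all and only the reachable nodes are marked and both link mappings are back to their original values, as the postcondition R requires.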

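Definitions. The stack starting at A: 

$$A{\uparrow}_{l,r,c} \triangleq \text{if } A = \mathit{nil} \text{ then } \langle\,\rangle \text{ else } \langle A\rangle \mathbin{@} \big((\text{if } A.c \text{ then } A.r \text{ else } A.l \text{ fi}){\uparrow}_{l,r,c}\big) \text{ fi}$$

The binary DAG reachable from A but containing none of the objects in set S: 

$$A{*}_{l,r,S} \triangleq \text{if } A = \mathit{nil} \vee A \in S \text{ then } (\,) \text{ else } \big(A,\ A.l{*}_{l,r,S \cup \{A\}},\ A.r{*}_{l,r,S \cup \{A\}}\big) \text{ fi}$$

The binary DAG reachable from A which contains only unmarked nodes and none 
of the objects in set S: 

$$A{*}_{l,r,m,S} \triangleq \text{if } A = \mathit{nil} \vee A \in S \vee A.m \text{ then } (\,) \text{ else } \big(A,\ A.l{*}_{l,r,m,S \cup \{A\}},\ A.r{*}_{l,r,m,S \cup \{A\}}\big) \text{ fi}$$

Two sequences zipped together: 

$$\langle\,\rangle \mathbin{|||} S \triangleq \langle\,\rangle \qquad R \mathbin{|||} \langle\,\rangle \triangleq \langle\,\rangle$$

$$(\langle A\rangle \mathbin{@} R) \mathbin{|||} (\langle B\rangle \mathbin{@} S) \triangleq \langle (A, B)\rangle \mathbin{@} (R \mathbin{|||} S)$$

Converting sequences and DAGs to sets: 

$$\mathit{set}\langle\,\rangle = \{\,\} \qquad \mathit{set}(\,) = \{\,\}$$

$$\mathit{set}\langle A\rangle = \{A\} \qquad \mathit{set}(A, T1, T2) = \{A\} \cup \mathit{set}\,T1 \cup \mathit{set}\,T2$$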

set(R @ 5) = set /? u set S 
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Specification. The graph reachable from the root is iG. The whole of the heap is 
unmarked. Initial values of the control-bit mapping, the left and right subnode 
mappings, are iC, iL and iR. 

Q = rootj ^ Ij = IG A \/x: — ijc.m a c = iC a I = iL a r = iR 

Throughout the marking process the stack p is a list, and every node on the stack is 
marked. Every node in the original graph is reachable from the tip t and/or the stack. 
Unmarked nodes in the graph are reachable from the tip and/or the right sub-nodes of 
elements of the stack. If a node is marked then it’s in the graph; if it’s not marked 
then its control bit is unchanged; if it’s not on the stack then its left and right 
components are as they were originally. If a node is on the stack then we can 
reconstruct its left and right components by considering its predecessor node and its 
control bit. 



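$$
\begin{array}{l}
\mathrm{list}(p{\uparrow}_{l,r,c}) \;\land\; \forall x\colon \bigl(x \in \mathrm{set}(p{\uparrow}_{l,r,c}) \to x.m\bigr) \;\land \\[4pt]
\mathrm{set}(t{*}_{l,r,\{\}}) \cup \mathrm{set}(p{*}_{l,r,\{\}}) = \mathrm{set}\,iG \;\land\; \forall x\colon (x.m \to x \in \mathrm{set}\,iG) \;\land \\[4pt]
\forall x\colon \Bigl(x \in \mathrm{set}\,iG \land \lnot x.m \to x \in \mathrm{set}(t{*}{*}_{l,r,m,\{\}}) \cup \bigcup\,\{\mathrm{set}(y.r{*}{*}_{l,r,m,\{\}}) \mid y \in \mathrm{set}(p{\uparrow}_{l,r,c})\}\Bigr) \;\land \\[4pt]
\forall x\colon \bigl((\lnot x.m \to x.c = x.iC) \land (x \notin \mathrm{set}(p{\uparrow}_{l,r,c}) \to x.l = x.iL \land x.r = x.iR)\bigr) \;\land \\[4pt]
\forall x, y\colon \Bigl((x, y) \in \mathrm{set}\bigl(p{\uparrow}_{l,r,c} \mathbin{|||} (\langle t\rangle \mathbin{@} p{\uparrow}_{l,r,c})\bigr) \to \\
\qquad \textbf{if } x.c \textbf{ then } x.l = x.iL \land y = x.iR \textbf{ else } y = x.iL \land x.r = x.iR \textbf{ fi}\Bigr)
\end{array}
$$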



The measure of the loop is a triple of the number of unmarked nodes in the graph, 
the number of nodes on the stack with control bit set to true, and the length of the 
stack. 

On termination all and only the original graph is marked. Unmarked nodes have an 
unchanged control bit, and the left and right components are as they were on input. 

$$R \mathrel{\hat{=}} \forall x\colon \bigl((x.m \leftrightarrow x \in \mathrm{set}\,iG) \land (\lnot x.m \to x.c = x.iC) \land x.l = x.iL \land x.r = x.iR\bigr)$$



Proof. DAGs based on the definitions above are finite height, given a finite heap 
(provable by induction on the difference between the restriction set S and the heap). 
That permits inductive proof of all kinds of interesting equivalences, including, for 
example 

BsS^ ^ "^l.r.m.s) 

The marvellous character of the algorithm is not that it reaches every node - a 
simple recursion would do that - but that it does so without using a stack, modifying 
nodes it has passed through and - most marvellously of all - restoring them
afterwards. Corresponding parts of the proof are large. We have to prove in the 
SWING arm, for example, that 

p^nil, -^p.c, \/x:[x e set{^pT ^ x.m^, list{j).t\ i (p)0 p.zT , 

(— ix.m ^ x.c = x.iC) A ^ set(/?T j ^ {x.l = x.iLAX.r = x.i-R)j a 

Yy. (■^-3')eset(/’'l', ^ J||((t)@/?T, ^,))^ 

^ l^if x.c then x./ = x.iL a y = x.iR else y = x.iL a x.r = x.iR fi J 
(— ix.m ^ x.(c © p h- > true) = x.iC'j a 
fx ^ set(pT, ^ ^ 

^x.(Z © p h- > f) = X.iL A x.(r © p p.l) = X.iR) J 

(x,y) G set(pT, Jll ((p.r)@pT, ,.^^)j ^ 

Vy: fif x.(c © p true) then x.(Z © p t) = x.iL a y = x.iR ^ 

[^l^else y = x.iL a x.(r © p p.l) = x.iR fi jj 

It’s for the most part straightforward, but it’s tedious manipulation. In Burstall’s 
phrase, a great deal of work for such a simple matter. 

From the DAG definition it’s possible to prove that 

$$t{*}_{l,r,S\cup\{p\}} \cup p{*}_{l,r,S} \;=\; t{*}_{l,r,S} \cup p{*}_{l,r,S} \;=\; t{*}_{l,r,S} \cup p{*}_{l,r,S\cup\{t\}}$$

That gives enough spatial separation to make it easy to prove that the nodes of the 
original graph are always reachable. It’s a little harder to show that the unmarked 
nodes are always directly reachable: we have to show when dealing with the PUSH 
arm, for example, that 

t nil,—\t.m,list{^p'li ^ 



X G set iG A —ix.m - 






uU{set(y.r**,^„,_^j)|xGpt,_,,,jj 



'^x G set iG A —ix.{m © n— > true) ■ 

( I 

cptl f 1 



p,r,m®t\-^true,[ } 



' l®t\-^p,r,c 



The proof that the stack is invariantly a list sometimes has to deal with some
superficially frightening formulae. The SWING arm, for example, generates

$$p{\uparrow}_{l \oplus p \mapsto t,\; r \oplus p \mapsto p.l,\; c \oplus p \mapsto \mathit{true}}$$

but, with a single step of expansion, plus $\lnot p.c$ from the guard and $p \neq \mathit{nil}$ and
$\mathrm{list}(p{\uparrow}_{l,r,c})$ from the invariant, this simplifies to $p{\uparrow}_{l,r,c}$.
12 Conclusion

Burstall showed a way towards practical proofs of pointer programs which relatively 
few have followed. This paper shows that it is possible to reason in Hoare logic about 
small but moderately complicated pointer programs, using the principle of spatial 
separation and whatever data structure and other auxiliary definitions suit the 
problem. 

Burstall’s DNLRS mechanism achieved local reasoning by restricting itself to 
particular data structures and particular problems. The treatment of general data 
structures in this paper doesn’t yet approach the elegance of his solution: it substitutes 
first, tidies up second, and remains rather low-level. To make it more elegant and 
more practically useful, it will be necessary to make the substitution mechanism 
mimic the locality of assignment, dealing only with genuine potential aliases and 
ignoring those which can be dealt with by spatial separation assumptions. If we can 
do this for a wide range of problems, building on a relatively small collection of data 
structure inductions, this goal may perhaps be reached. 
Abstract. For the purpose of program development, fairness is typically
formalized by verification rules or, alternatively, through refinement
rules. In this paper we give an account of (weak) fairness in an algebraic
style, extending recently proposed algebraic accounts of iterations and
loops using the predicate transformer model of statements.



1 Introduction 

The nondeterministic choice known from Dijkstra’s guarded commands allows 
an arbitrary selection to be made each time it is encountered, giving implemen- 
tations maximal freedom. Alternatively, we can assume that a certain degree 
of fairness is exercised for repeated choices, a useful assumption for modeling 
concurrent systems. For the purpose of program development, fairness has been 
formalized by verification rules, for example in [9,8]. More recently, fairness has 
been formally treated through refinement rules [5,14]. 

The goal of this paper is to give an account of fairness in an algebraic style. 
Doing so, we follow and extend the algebraic accounts of various forms of iter- 
ations and of loops in particular [4] , which has been used to derive transforma- 
tion rules for program development techniques like data refinement, atomicity 
refinement, reordering, and others. Our contribution is to define (weak) fairness 
algebraically and to derive elementary refinement and verification rules, using 
predicate transformers as the model of statements. 

The next section introduces the basic statements in the predicate transformer
model and extends the discussion to various forms of iteration statements
defined by fixed points. Section 3 introduces fair choice and gives basic theorems
supporting the definition. Section 4 gives the fundamental loop theorem for loops
with a fair choice. We conclude with some remarks in Section 5.



2 Statements 

We assume the reader is familiar with the principles of weakest precondition 
semantics and program refinement [2,10,11]. Here we briefly review the funda- 
mentals of statements defined by predicate transformers, following the treatment 
of [3], using typed, higher-order logic. The iteration statements are based on [1,7] 
and in particular on [4]. 
State Predicates. State predicates of type $\mathcal{P}\Sigma$ are functions from elements of type
$\Sigma$ to Bool, i.e. $\mathcal{P}\Sigma = \Sigma \to \mathrm{Bool}$. On state predicates, conjunction $\land$, disjunction
$\lor$, implication $\Rightarrow$, and negation $\lnot$ are defined by the pointwise extension of the
corresponding operations on Bool. Likewise, universal and existential
quantification of $p_i : \mathcal{P}\Sigma$ are defined by:

$$(\forall i \in I \bullet p_i)\,\sigma = (\forall i \in I \bullet p_i\,\sigma)$$

$$(\exists i \in I \bullet p_i)\,\sigma = (\exists i \in I \bullet p_i\,\sigma)$$

The entailment ordering $\le$ is defined by universal implication. The state
predicates true and false represent the universally true and false predicates,
respectively.

Predicate Transformers. Following Dijkstra, statements are defined by predicate
transformers. As only their input-output behavior is of interest to us, we identify a
statement with its predicate transformer, i.e. we write $S\,p$ rather than $wp(S, p)$.
Formally, predicate transformers of type $\Delta \mapsto \Omega$ are functions from predicates
over $\Omega$ (the postconditions) to predicates over $\Delta$ (the preconditions),
$\Delta \mapsto \Omega = \mathcal{P}\Omega \to \mathcal{P}\Delta$. A predicate transformer $S$ is called monotonic if it satisfies
$p \le q \Rightarrow S\,p \le S\,q$ for any (state) predicates $p$ and $q$. We use monotonic
predicate transformers to model statements.

The sequential composition of predicate transformers $S$ and $T$ is defined by
their functional composition:

$$(S ; T)\,q \mathrel{\hat{=}} S\,(T\,q)$$

The guard $[p]$ skips if $p$ holds and establishes “miraculously” any postcondition
if $p$ does not hold (by blocking execution). The assertion $\{p\}$ skips if $p$ holds
and establishes no postcondition if $p$ does not hold (the system crashes):

$$[p]\,q \mathrel{\hat{=}} p \Rightarrow q$$

$$\{p\}\,q \mathrel{\hat{=}} p \land q$$

We define skip = [true] = {true} as the identity predicate transformer, magic =
[false] as the predicate transformer which always establishes any postcondition,
and abort = {false} as the predicate transformer which always aborts.

The demonic (nondeterministic) choice $\sqcap$ establishes a postcondition only if
both alternatives do. The angelic choice $\sqcup$ establishes a certain postcondition if
at least one alternative does.

$$(S \sqcap T)\,q \mathrel{\hat{=}} S\,q \land T\,q$$

$$(S \sqcup T)\,q \mathrel{\hat{=}} S\,q \lor T\,q$$

Relations of type $\Delta \leftrightarrow \Omega$ are functions from $\Delta$ to predicates over $\Omega$. The
relational updates $[R]$ and $\{R\}$ both update the state according to relation $R$.
If several final states are possible, then $[R]$ chooses one demonically and $\{R\}$
chooses one angelically. If $R$ is of type $\Delta \leftrightarrow \Omega$, then $[R]$ and $\{R\}$ are of type
$\Delta \mapsto \Omega$:

$$[R]\,q\,\delta \mathrel{\hat{=}} (\forall \omega \bullet R\,\delta\,\omega \Rightarrow q\,\omega)$$

$$\{R\}\,q\,\delta \mathrel{\hat{=}} (\exists \omega \bullet R\,\delta\,\omega \land q\,\omega)$$
The predicate transformers $[p]$, $\{p\}$, $[R]$, $\{R\}$ are all monotonic and the operators
$\sqcap$, $\sqcup$ preserve monotonicity.
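
For readers who prefer to experiment, the basic statements can be tabulated over a small finite state space. The following Python sketch is only an illustration under stated assumptions: the three-element state space, the helper names (`pt`, `guard`, `demonic`, ...) and the relation `step` are hypothetical, and a predicate transformer is represented as a dictionary from postconditions to preconditions.

```python
from itertools import combinations

STATES = frozenset({0, 1, 2})                 # hypothetical finite state space
PREDS = [frozenset(c) for n in range(len(STATES) + 1)
         for c in combinations(sorted(STATES), n)]   # all state predicates (as sets)


def pt(f):
    """Tabulate a predicate transformer as {postcondition: precondition}."""
    return {q: frozenset(f(q)) for q in PREDS}


def guard(p):       # [p] q  =  p => q
    return pt(lambda q: (STATES - p) | q)


def assertion(p):   # {p} q  =  p /\ q
    return pt(lambda q: p & q)


def seq(s, t):      # (S ; T) q  =  S (T q)
    return {q: s[t[q]] for q in PREDS}


def meet(s, t):     # demonic choice: both alternatives must establish q
    return {q: s[q] & t[q] for q in PREDS}


def join(s, t):     # angelic choice: one alternative suffices
    return {q: s[q] | t[q] for q in PREDS}


def demonic(rel):   # [R] q d  =  every R-successor of d satisfies q
    return pt(lambda q: {d for d in STATES if rel(d) <= q})


def angelic(rel):   # {R} q d  =  some R-successor of d satisfies q
    return pt(lambda q: {d for d in STATES if rel(d) & q})


def grd(s):         # enabledness domain: not (S false)
    return STATES - s[frozenset()]


def monotonic(s):
    return all(s[p] <= s[q] for p in PREDS for q in PREDS if p <= q)


SKIP, MAGIC, ABORT = guard(STATES), guard(frozenset()), assertion(frozenset())
step = lambda d: {d, (d + 1) % 3}             # stay, or advance cyclically
```

With this encoding, `demonic(step)` cannot guarantee reaching state 0 from anywhere, whereas `angelic(step)` can from states 0 and 2; all the tabulated statements pass the `monotonic` check.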

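Other statements can be defined in terms of the above ones. For example the
guarded statement $p \to S$ is defined by $[p] ; S$ and the conditional by:

$$\textbf{if } p \textbf{ then } S \textbf{ else } T \mathrel{\hat{=}} (p \to S) \sqcap (\lnot p \to T)$$

The enabledness domain (guard) of a statement $S$ is denoted by $\mathrm{grd}\,S = \lnot S\,\mathit{false}$
and its termination domain by $\mathrm{trm}\,S = S\,\mathit{true}$. We have that:

$$\mathrm{grd}\,(S \sqcap T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T \qquad (1)$$

$$\mathrm{grd}\,T = \mathit{true} \;\Rightarrow\; \mathrm{grd}\,(S ; T) = \mathrm{grd}\,S \qquad (2)$$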

Program Variables. Typically the state space is made up of a number of program
variables. Thus the state space is of the form $\Gamma_1 \times \ldots \times \Gamma_n$. States are tuples
$(x_1, \ldots, x_n)$. The variable names serve for selecting components of the state. For
example, if $x : \Gamma$ and $y : \Delta$ are the only program variables, then the assignment
$x := e$ updates $x$ and leaves $y$ unchanged:

$$x := e \mathrel{\hat{=}} [R] \quad\text{where}\quad R\,(x, y)\,(x', y') = (x' = e) \land (y' = y)$$

Refinement Ordering. The refinement ordering $\sqsubseteq$ is defined by universal
entailment:

$$S \sqsubseteq T \mathrel{\hat{=}} (\forall q \bullet S\,q \le T\,q)$$

With this ordering, the monotonic predicate transformers form a complete
lattice, with top magic, bottom abort, meet $\sqcap$, and join $\sqcup$. Hence
any monotonic function $f$ from predicate transformers to predicate transformers
has a unique least fixed point $\mu f$ and a unique greatest fixed point $\nu f$, also
written as $\mu x \bullet f\,x$ and $\nu x \bullet f\,x$, respectively.

Iterations. Iteration of a statement $S$ is described through solutions of the
equation $X = S ; X \sqcap \mathsf{skip}$. More precisely, we define two fundamental iteration
constructs, the strong iteration $S^\omega$ and the weak iteration $S^*$. We use the convention
that $;$ binds tighter than $\sqcap$:

$$S^\omega \mathrel{\hat{=}} (\mu X \bullet S ; X \sqcap \mathsf{skip}) \qquad (3)$$

$$S^* \mathrel{\hat{=}} (\nu X \bullet S ; X \sqcap \mathsf{skip}) \qquad (4)$$

Both define a demonically chosen number of repetitions of $S$. However, with $S^*$
the number of repetitions is always finite whereas with $S^\omega$ it can be infinite,
which is equivalent to abortion. For example, if $S$ is $a := a + 1$, then the equation
$X = a := a + 1 ; X \sqcap \mathsf{skip}$ has two solutions, abort and
$\mathsf{skip} \sqcap a := a + 1 \sqcap a := a + 2 \sqcap \ldots$. The least solution is given by their demonic choice. As
$\mathsf{abort} \sqcap Q = \mathsf{abort}$ for any $Q$, we have that $(a := a + 1)^\omega = \mathsf{abort}$. The greatest solution
is given by their angelic choice. As $\mathsf{abort} \sqcup Q = Q$ for any $Q$, we have that
$(a := a + 1)^* = \mathsf{skip} \sqcap a := a + 1 \sqcap a := a + 2 \sqcap \ldots$.
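
On a finite state space both fixed points can be computed by plain iteration, starting from abort for the least and from magic for the greatest fixed point. The sketch below makes this concrete; it is an illustration only, and it replaces the unbounded counter of the example by a hypothetical counter modulo 3 so that the lattice is finite. All helper names are assumptions.

```python
from itertools import combinations

STATES = frozenset({0, 1, 2})                 # hypothetical: a counter modulo 3
PREDS = [frozenset(c) for n in range(len(STATES) + 1)
         for c in combinations(sorted(STATES), n)]


def pt(f):
    """Tabulate a predicate transformer as {postcondition: precondition}."""
    return {q: frozenset(f(q)) for q in PREDS}


def seq(s, t):
    return {q: s[t[q]] for q in PREDS}


def meet(s, t):
    return {q: s[q] & t[q] for q in PREDS}


def refines(s, t):                            # S is refined by T
    return all(s[q] <= t[q] for q in PREDS)


SKIP = pt(lambda q: q)
MAGIC = pt(lambda q: STATES)                  # top of the refinement lattice
ABORT = pt(lambda q: frozenset())             # bottom of the refinement lattice


def fix(f, x):
    """Iterate a monotonic f from x until it stabilises (finite lattice)."""
    while f(x) != x:
        x = f(x)
    return x


def strong(s):                                # least fixed point, cf. (3)
    return fix(lambda x: meet(seq(s, x), SKIP), ABORT)


def weak(s):                                  # greatest fixed point, cf. (4)
    return fix(lambda x: meet(seq(s, x), SKIP), MAGIC)


inc = pt(lambda q: {d for d in STATES if (d + 1) % 3 in q})   # a := (a + 1) mod 3
```

Here `strong(inc)` collapses to `ABORT`, mirroring the claim that unbounded repetition of an always-enabled statement aborts, while `weak(inc)` terminates everywhere but guarantees no particular final value.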
From the fixed point definitions we get the following laws for unfolding
iterations:

$$S^\omega = S ; S^\omega \sqcap \mathsf{skip} \qquad (5)$$

$$S^* = S ; S^* \sqcap \mathsf{skip} \qquad (6)$$

Both weak and strong iteration are monotonic in the sense that $S \sqsubseteq T$
implies $S^\omega \sqsubseteq T^\omega$ and $S^* \sqsubseteq T^*$. Both $S^\omega$ and $S^*$ are refined by $S$ itself:

$$S^\omega \sqsubseteq S \qquad (7)$$

$$S^* \sqsubseteq S \qquad (8)$$

Furthermore, from the two unfolding laws we get immediately (as $S = T \sqcap U$
implies $S \sqsubseteq T$ for any $S$, $T$, $U$) that both are refined by skip:

$$S^\omega \sqsubseteq \mathsf{skip} \qquad (9)$$

$$S^* \sqsubseteq \mathsf{skip} \qquad (10)$$

For the nested application of weak and strong iteration we have:

$$(S^\omega)^* = S^\omega \qquad (11)$$

$$(S^*)^* = S^* \qquad (12)$$

However, we note that $(S^\omega)^\omega = \mathsf{abort}$ and $(S^*)^\omega = \mathsf{abort}$. Intuitively, the inner
iteration is refined by skip, which then makes $\mathsf{skip}^\omega = \mathsf{abort}$.

We introduce a derived iteration construct, the positive weak iteration $S^+$:

$$S^+ \mathrel{\hat{=}} S ; S^*$$

Positive weak iteration is also monotonic in the sense that $S \sqsubseteq T$ implies
$S^+ \sqsubseteq T^+$, which follows from the monotonicity of weak iteration and sequential
composition (in both arguments). Furthermore, $S^+$ is refined by $S$ itself:

$$S^+ \sqsubseteq S \qquad (13)$$

This follows from the definition of $S^+$ and (10). Weak iteration can also be
defined in terms of positive weak iteration:

$$S^* = S^+ \sqcap \mathsf{skip} \qquad (14)$$

This follows immediately from the unfolding law (6) and the definition of $S^+$. A
consequence of this is that $S^*$ is refined by $S^+$:

$$S^* \sqsubseteq S^+ \qquad (15)$$

For the nested applications of positive weak iterations with weak iteration and
strong iteration we get:

$$(S^+)^* = S^* \qquad (16)$$

$$(S^+)^\omega = S^\omega \qquad (17)$$
We show the first one by mutual refinement: $(S^+)^* \sqsubseteq S^*$ holds by (13) and
monotonicity of weak iteration. For the refinement $S^* \sqsubseteq (S^+)^*$ we note that the
left side is equal to $(S^*)^*$ by (12), hence this is implied by (15).

For the guards of the iteration constructs we get:

$$\mathrm{grd}\,S^\omega = \mathit{true} \qquad (18)$$

$$\mathrm{grd}\,S^* = \mathit{true} \qquad (19)$$

$$\mathrm{grd}\,S^+ = \mathrm{grd}\,S \qquad (20)$$

The first two follow immediately from the unfolding laws (5) and (6) as grd skip =
true. The last one follows easily from the definition of $S^+$, (19) and (2).

The loop do S od executes its body as long as it is enabled, possibly not
terminating (i.e. aborting). This is formally expressed as a strong iteration,
followed by a guard statement which ensures that the guard of the body will not
hold at exit:

$$\textbf{do } S \textbf{ od} \mathrel{\hat{=}} S^\omega ; [\lnot \mathrm{grd}\,S]$$

The while loop while b do S can then be defined as do $b \to S$ od, provided $S$
is always enabled.

3 Fair Choice 

Fairness is associated with the process of repeated selection between alternatives;
we restrict our attention to two alternative statements, say $S$ and $T$, in loops of
the form do $S \diamond T$ od: if $S$ or $T$ is continuously enabled, respectively, it will
be eventually taken, a criterion known as weak fairness (e.g. [9]). By contrast,
in the loop do $S \sqcap T$ od on each iteration an arbitrary choice between $S$ and $T$
is made.

Following the spirit of the approach, we define the fair choice $S \diamond T$ in
isolation, such that the meaning of do $S \diamond T$ od is defined in a compositional
manner. First we introduce an operator $\overline{S}$, read “try $S$”, for a predicate
transformer $S$. If $S$ is enabled, $\overline{S}$ behaves as $S$, otherwise as skip:

$$\overline{S} \mathrel{\hat{=}} S \sqcap [\lnot \mathrm{grd}\,S]$$

In the fair choice between $S$ and $T$ we may take $S$ or $T$ arbitrarily but finitely
often, such that we can give $T$ and $S$ a chance, respectively. We express this in
terms of positive weak iterations:

$$S \diamond T \mathrel{\hat{=}} S^+ ; \overline{T} \;\sqcap\; T^+ ; \overline{S}$$

We now consider the construct $S \triangleleft T$ which guarantees fairness only for $S$, and
dually the construct $S \triangleright T$ which guarantees fairness only for $T$:

$$S \triangleleft T \mathrel{\hat{=}} S^+ \sqcap (T^+ ; \overline{S})$$

$$S \triangleright T \mathrel{\hat{=}} (S^+ ; \overline{T}) \sqcap T^+$$
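For reasons of symmetry, we continue only with $S \triangleleft T$. We state some facts
about these operators for supporting our confidence. To this end, we introduce
one further operator, $S \mathbin{\square} T$, for the repeated execution of either $S$ or $T$:

$$S \mathbin{\square} T \mathrel{\hat{=}} S^+ \sqcap T^+$$

We notice that $S \diamond T$, $S \triangleleft T$, and $S \mathbin{\square} T$ are enabled if either $T$ is enabled
or $S$ is enabled:

$$\mathrm{grd}\,(S \diamond T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T \qquad (21)$$

$$\mathrm{grd}\,(S \triangleleft T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T \qquad (22)$$

$$\mathrm{grd}\,(S \mathbin{\square} T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T \qquad (23)$$

We prove only the first of these claims as the proofs of the others are similar:

    $\mathrm{grd}\,(S \diamond T)$
  =   (definition of $\diamond$)
    $\mathrm{grd}\,(S^+ ; \overline{T} \sqcap T^+ ; \overline{S})$
  =   ((1), definition of $S^+$)
    $\mathrm{grd}\,(S ; S^* ; \overline{T}) \lor \mathrm{grd}\,(T ; T^* ; \overline{S})$
  =   ($\mathrm{grd}\,\overline{S} = \mathit{true}$ for any $S$, (2), (19))
    $\mathrm{grd}\,S \lor \mathrm{grd}\,T$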

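In a loop containing the alternatives $S$ and $T$, there is no difference whether
they are executed exactly once at each iteration, or several times:

Theorem 1. Let S and T be monotonic predicate transformers. Then:

$$\textbf{do } S \sqcap T \textbf{ od} = \textbf{do } S \mathbin{\square} T \textbf{ od}$$

Proof. According to the definition of do S od we have to show:

$$(S \sqcap T)^\omega ; [\lnot \mathrm{grd}\,(S \sqcap T)] = (S \mathbin{\square} T)^\omega ; [\lnot \mathrm{grd}\,(S \mathbin{\square} T)]$$

By (1) and (23) we have that $\mathrm{grd}\,(S \sqcap T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T = \mathrm{grd}\,(S \mathbin{\square} T)$. Hence
it is sufficient to show that $(S \sqcap T)^\omega = (S \mathbin{\square} T)^\omega$ which, by the definition of $\square$,
is equivalent to $(S \sqcap T)^\omega = (S^+ \sqcap T^+)^\omega$. We show this by mutual refinement:

    $(S \sqcap T)^\omega \sqsubseteq (S^+ \sqcap T^+)^\omega$
  =   ((17))
    $((S \sqcap T)^+)^\omega \sqsubseteq (S^+ \sqcap T^+)^\omega$
  ⇐   (monotonicity of $S^\omega$)
    $(S \sqcap T)^+ \sqsubseteq S^+ \sqcap T^+$
  =   (lattice property)
    $((S \sqcap T)^+ \sqsubseteq S^+) \land ((S \sqcap T)^+ \sqsubseteq T^+)$
  ⇐   (monotonicity of $S^+$)
    $(S \sqcap T \sqsubseteq S) \land (S \sqcap T \sqsubseteq T)$

The last line follows from the lattice structure. The reverse refinement
$(S^+ \sqcap T^+)^\omega \sqsubseteq (S \sqcap T)^\omega$ follows immediately from (13) and the monotonicity of strong
iteration.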
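A loop with a demonic choice between two alternatives is refined by the same
loop with a choice which is fair to one of the alternatives:

Theorem 2. Let S and T be monotonic predicate transformers. Then:

$$\textbf{do } S \sqcap T \textbf{ od} \sqsubseteq \textbf{do } S \triangleleft T \textbf{ od}$$

Proof. Applying Theorem 1 we calculate:

    do $S \mathbin{\square} T$ od $\sqsubseteq$ do $S \triangleleft T$ od
  =   (definition of do S od)
    $(S \mathbin{\square} T)^\omega ; [\lnot \mathrm{grd}\,(S \mathbin{\square} T)] \sqsubseteq (S \triangleleft T)^\omega ; [\lnot \mathrm{grd}\,(S \triangleleft T)]$
  ⇐   (as $\mathrm{grd}\,(S \mathbin{\square} T) = \mathrm{grd}\,(S \triangleleft T)$ by (23) and (22), monotonicity)
    $(S \mathbin{\square} T)^\omega \sqsubseteq (S \triangleleft T)^\omega$
  =   ((17))
    $((S \mathbin{\square} T)^+)^\omega \sqsubseteq (S \triangleleft T)^\omega$
  ⇐   (monotonicity of $S^\omega$)
    $(S \mathbin{\square} T)^+ \sqsubseteq S \triangleleft T$
  =   (definition of $\square$, $\triangleleft$)
    $(S^+ \sqcap T^+)^+ \sqsubseteq S^+ \sqcap T^+ ; \overline{S}$
  =   (lattice property)
    $((S^+ \sqcap T^+)^+ \sqsubseteq S^+) \land ((S^+ \sqcap T^+)^+ \sqsubseteq T^+ ; \overline{S})$
  ⇐   ((13), monotonicity of $S^+$, definition of $\overline{S}$)
    $(S^+ \sqcap T^+ \sqsubseteq S^+) \land ((S^+ \sqcap T^+)^+ \sqsubseteq T^+ ; (S \sqcap [\lnot \mathrm{grd}\,S]))$
  ⇐   (lattice property, $;$ distributes over $\sqcap$)
    $(S^+ \sqcap T^+)^+ \sqsubseteq T^+ ; S \sqcap T^+ ; [\lnot \mathrm{grd}\,S]$
  ⇐   (lattice property, $\mathsf{skip} \sqsubseteq [p]$ for any $p$, $S ; \mathsf{skip} = S$ for any $S$)
    $((S^+ \sqcap T^+)^+ \sqsubseteq T^+ ; S) \land ((S^+ \sqcap T^+)^+ \sqsubseteq T^+)$
  ⇐   (definition of $S^+$, (13), lattice property)
    $(S^+ \sqcap T^+) ; (S^+ \sqcap T^+)^* \sqsubseteq T^+ ; S$
  ⇐   (monotonicity of $;$)
    $(S^+ \sqcap T^+ \sqsubseteq T^+) \land ((S^+ \sqcap T^+)^* \sqsubseteq S)$
  ⇐   (lattice property, (8))
    $S^+ \sqcap T^+ \sqsubseteq S$

The last line follows from (13) and the lattice structure.

A loop with a demonic choice between two alternatives is also refined by the
same loop with a fair choice between the two alternatives. The proof is similar
to the previous one:

Theorem 3. Let S and T be monotonic predicate transformers. Then:

$$\textbf{do } S \sqcap T \textbf{ od} \sqsubseteq \textbf{do } S \diamond T \textbf{ od}$$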
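
Theorems 1 to 3 can be checked mechanically on a small instance. The sketch below is an illustration under stated assumptions, not part of the paper: the state is a single boolean b (encoded as 0 or 1), $S$ is the hypothetical statement $b \to b := \mathit{false}$, $T$ is $b \to \mathsf{skip}$, and all helper names are invented for the example. Fixed points are computed by iteration, which is exact on a finite lattice.

```python
from itertools import combinations

STATES = frozenset({0, 1})        # hypothetical state: value of a boolean b (1 = b holds)
PREDS = [frozenset(c) for n in range(3) for c in combinations((0, 1), n)]
EMPTY = frozenset()


def pt(f):
    """Tabulate a predicate transformer as {postcondition: precondition}."""
    return {q: frozenset(f(q)) for q in PREDS}


def guard(p): return pt(lambda q: (STATES - p) | q)
def seq(s, t): return {q: s[t[q]] for q in PREDS}
def meet(s, t): return {q: s[q] & t[q] for q in PREDS}
def grd(s): return STATES - s[EMPTY]
def refines(s, t): return all(s[q] <= t[q] for q in PREDS)


SKIP, MAGIC, ABORT = pt(lambda q: q), pt(lambda q: STATES), pt(lambda q: EMPTY)


def fix(f, x):
    """Iterate a monotonic f from x until it stabilises."""
    while f(x) != x:
        x = f(x)
    return x


def strong(s): return fix(lambda x: meet(seq(s, x), SKIP), ABORT)   # S^omega
def weak(s): return fix(lambda x: meet(seq(s, x), SKIP), MAGIC)     # S^*
def plus(s): return seq(s, weak(s))                                 # S^+
def try_(s): return meet(s, guard(STATES - grd(s)))                 # "try S"
def do(s): return seq(strong(s), guard(STATES - grd(s)))            # do S od


def box(s, t): return meet(plus(s), plus(t))                        # repeated S or T
def left_fair(s, t): return meet(plus(s), seq(plus(t), try_(s)))    # fair to S only
def fair(s, t): return meet(seq(plus(s), try_(t)), seq(plus(t), try_(s)))


B = frozenset({1})
S = seq(guard(B), pt(lambda q: STATES if 0 in q else EMPTY))        # b -> b := false
T = guard(B)                                                        # b -> skip

demonic_loop = do(meet(S, T))
```

In this instance the demonic loop is only guaranteed to terminate when b is initially false, because $T$ may be chosen forever, whereas the loops built with `left_fair` and `fair` terminate from every state.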

Let us now consider how to implement the loop do $S \triangleleft T$ od. One way is to
use a round robin scheduler: first we test $S$; if it is enabled, we execute it and
continue with $T$, otherwise we continue immediately with $T$. If $T$ is enabled,
we execute it. Otherwise, if $T$ is not enabled but $S$ was enabled we start over
again. If $S$ was also not enabled, the loop terminates. This is expressed by the
loop do $S :: T$ od, where $S :: T$ is read “S then T”:

$$S :: T \mathrel{\hat{=}} S ; \overline{T} \;\sqcap\; [\lnot \mathrm{grd}\,S] ; T$$

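For example, a loop with a choice which is fair to one alternative can be
implemented by a round robin scheduler:

Theorem 4. Let S and T be monotonic predicate transformers. Then:

$$\textbf{do } S \triangleleft T \textbf{ od} \sqsubseteq \textbf{do } S :: T \textbf{ od}$$

Proof. We note that from the definitions we get $\mathrm{grd}\,(S :: T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T$.
We calculate:

    do $S \triangleleft T$ od $\sqsubseteq$ do $S :: T$ od
  =   (definition of do S od)
    $(S \triangleleft T)^\omega ; [\lnot \mathrm{grd}\,(S \triangleleft T)] \sqsubseteq (S :: T)^\omega ; [\lnot \mathrm{grd}\,(S :: T)]$
  ⇐   (as $\mathrm{grd}\,(S :: T) = \mathrm{grd}\,S \lor \mathrm{grd}\,T$, monotonicity of $;$)
    $(S \triangleleft T)^\omega \sqsubseteq (S :: T)^\omega$
  ⇐   (monotonicity of $S^\omega$)
    $S \triangleleft T \sqsubseteq S :: T$
  ⇐   (definition of $\triangleleft$, $::$)
    $S^+ ; \overline{T} \sqcap T^+ \sqsubseteq S ; \overline{T} \sqcap [\lnot \mathrm{grd}\,S] ; T$
  ⇐   ((13), monotonicity of $;$ and $\sqcap$)
    $T^+ \sqsubseteq [\lnot \mathrm{grd}\,S] ; T$

The last line follows from (13) and $\mathsf{skip} \sqsubseteq [p]$ for any $p$.

Alternatively, we can implement the loop do $S \triangleleft T$ od by a prioritizing
scheduler: whenever $S$ is enabled, it is executed. Only if $S$ is not enabled, $T$ is
tested and executed if enabled. Otherwise, if neither $S$ nor $T$ is enabled, the
loop terminates. This is expressed by the loop do $S \mathbin{/\!/} T$ od, where $S \mathbin{/\!/} T$, read
“S else T”, is the prioritized composition as studied in [13]:

$$S \mathbin{/\!/} T \mathrel{\hat{=}} S \sqcap [\lnot \mathrm{grd}\,S] ; T$$

Using prioritized composition, we can alternatively define $S :: T$ by $S ; \overline{T} \mathbin{/\!/} T$.

Theorem 5. Let S and T be monotonic predicate transformers. Then:

$$\textbf{do } S \triangleleft T \textbf{ od} \sqsubseteq \textbf{do } S \mathbin{/\!/} T \textbf{ od}$$

The proof is similar to the previous one.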
4 Verification of Loops with Fair Choice

For the purpose of verifying properties of loops we make use of ranked predicates.
Let $W$ be a well-founded set and let $P = \{p_w \mid w \in W\}$ be an indexed collection
of predicates, called ranked predicates. Typically, a ranked predicate $p_w$ is the
conjunction of an invariant $I$ and a variant $t$, i.e. $p_w = I \land (t = w)$.

For a given set $P$ as above we define $p = (\exists w \in W \bullet p_w)$ to be true if some
ranked predicate is true and with $W_{<w} = \{v \in W \mid v < w\}$ we define
$p_{<w} = (\exists v \in W_{<w} \bullet p_v)$ to be true if a predicate with lower rank than $p_w$ is true.
The following properties are taken from [3]. Let $S$ be a monotonic predicate
transformer and $\{r_w \mid w \in W\}$ a collection of ranked predicates:

$$(\forall w \in W \bullet r_w \le S\,r_{<w}) \;\Rightarrow\; r \le S^\omega\,r \qquad (24)$$

By contrast, for the verification of weak iteration we do not have to incorporate
a termination argument:

$$p \le S\,p \;\Rightarrow\; p \le S^*\,p \qquad (25)$$

We first state the basic loop verification theorem, from which we then derive a 
verification theorem for loops with a demonic choice and finally for loops with 
a fair choice. 

Theorem 6. Let S be a monotonic predicate transformer and $\{r_w \mid w \in W\}$ a
collection of ranked predicates. Then:

$$(\forall w \in W \bullet r_w \le S\,r_{<w}) \;\Rightarrow\; r \le \textbf{do } S \textbf{ od}\,(r \land \lnot \mathrm{grd}\,S)$$

Proof. Assuming $(\forall w \in W \bullet r_w \le S\,r_{<w})$ we calculate:

    do S od $(r \land \lnot \mathrm{grd}\,S)$
  =   (definition of do S od)
    $(S^\omega ; [\lnot \mathrm{grd}\,S])\,(r \land \lnot \mathrm{grd}\,S)$
  =   (definitions of guard and $;$)
    $S^\omega\,(\lnot \mathrm{grd}\,S \Rightarrow r \land \lnot \mathrm{grd}\,S)$
  =   (logic)
    $S^\omega\,(\mathrm{grd}\,S \lor r)$
  ≥   (monotonicity of $S^\omega$)
    $S^\omega\,r$
  ≥   (assumption, (24))
    $r$

The assumption $(\forall w \in W \bullet r_w \le S\,r_{<w})$ expresses that statement $S$ preserves
the invariant part of $r_w$ while making progress towards termination by
establishing a postcondition of lower rank. If the loop contains a demonic choice between
two alternatives, then each of the alternatives has to preserve the invariant and
make progress towards termination:
Theorem 7. Let S and T be monotonic predicate transformers and $\{r_w \mid w \in W\}$ a
collection of ranked predicates. Let $g = \mathrm{grd}\,S \lor \mathrm{grd}\,T$. Then:

$$(r_w \le S\,r_{<w}) \land (r_w \le T\,r_{<w}) \;\Rightarrow\; r \le \textbf{do } S \sqcap T \textbf{ od}\,(r \land \lnot g)$$

Proof. According to (1) and Theorem 6 it is sufficient to show
$(\forall w \in W \bullet r_w \le (S \sqcap T)\,r_{<w})$. For this, we calculate for any $w \in W$:

    $r_w \le (S \sqcap T)\,r_{<w}$
  =   (definition of $\sqcap$)
    $r_w \le S\,r_{<w} \land T\,r_{<w}$
  =   (logic)
    $(r_w \le S\,r_{<w}) \land (r_w \le T\,r_{<w})$

The last line is exactly the assumption of the theorem.



Now we consider the loop do $S \triangleleft T$ od. In order to ensure termination of this
loop, it is sufficient that S makes progress towards termination by establishing 
a postcondition with lower rank than the precondition, and that T either keeps 
the rank of the postcondition the same as the precondition but enables S (or 
keeps S enabled if it was) or otherwise also decreases the rank. In the first case 
the fairness of S ensures that it will be eventually taken, thus making progress 
towards termination. 



Theorem 8. Let S and T be monotonic predicate transformers and $\{r_w \mid w \in W\}$ a
collection of ranked predicates. Let $g = \mathrm{grd}\,S \lor \mathrm{grd}\,T$. Then:

$$(r_w \le S\,r_{<w}) \land (r_w \le T\,((r_w \land \mathrm{grd}\,S) \lor r_{<w})) \;\Rightarrow\; r \le \textbf{do } S \triangleleft T \textbf{ od}\,(r \land \lnot g)$$

We note that $r_{<w} \le (r_w \land \mathrm{grd}\,S) \lor r_{<w}$. Hence this assumption is indeed weaker
than that of Theorem 7.



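Proof. First we observe that from $r_w \le S\,r_{<w}$ we get $r_{<w} \le S\,r_{<w}$ as $r_{<w} \le r_w$.
By (25) this implies $r_{<w} \le S^*\,r_{<w}$. As $(r_w \land \mathrm{grd}\,S) \lor r_{<w} \le r_w$ we get similarly
$(r_w \land \mathrm{grd}\,S) \lor r_{<w} \le T^*\,((r_w \land \mathrm{grd}\,S) \lor r_{<w})$. Next we note that according to
(22) and Theorem 6 it is sufficient to show $(\forall w \in W \bullet r_w \le (S \triangleleft T)\,r_{<w})$. For
this, we calculate for any $w \in W$:

    $r_w \le (S \triangleleft T)\,r_{<w}$
  =   (definition of $\triangleleft$ and $\sqcap$)
    $r_w \le S^+\,r_{<w} \land (T^+ ; \overline{S})\,r_{<w}$
  =   (definition of $S^+$ and $;$)
    $r_w \le S\,(S^*\,r_{<w}) \land T^+\,(\overline{S}\,r_{<w})$
  =   (logic)
    $(r_w \le S\,(S^*\,r_{<w})) \land (r_w \le T^+\,(\overline{S}\,r_{<w}))$
  ⇐   (as $r_{<w} \le S^*\,r_{<w}$, monotonicity of $S$)
    $(r_w \le S\,r_{<w}) \land (r_w \le T^+\,(\overline{S}\,r_{<w}))$
  ⇐   (assumption $r_w \le S\,r_{<w}$, definition of $\overline{S}$)
    $r_w \le T^+\,((S \sqcap [\lnot \mathrm{grd}\,S])\,r_{<w})$
  =   (definition of $\sqcap$ and $[p]$)
    $r_w \le T^+\,(S\,r_{<w} \land (\lnot \mathrm{grd}\,S \Rightarrow r_{<w}))$
  ⇐   (assumption $r_w \le S\,r_{<w}$, monotonicity of $T^+$, logic)
    $r_w \le T^+\,(r_w \land (\mathrm{grd}\,S \lor r_{<w}))$
  =   (logic, $r_{<w} \le r_w$)
    $r_w \le T^+\,((r_w \land \mathrm{grd}\,S) \lor r_{<w})$
  =   (definition of $T^+$, $;$)
    $r_w \le T\,(T^*\,((r_w \land \mathrm{grd}\,S) \lor r_{<w}))$
  ⇐   (as $(r_w \land \mathrm{grd}\,S) \lor r_{<w} \le T^*\,((r_w \land \mathrm{grd}\,S) \lor r_{<w})$, (25))
    $r_w \le T\,((r_w \land \mathrm{grd}\,S) \lor r_{<w})$

The last line is just the second part of the assumption of the theorem.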
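
As a sanity check, the premises and the conclusion of Theorem 8 can be evaluated on a tiny instance. The sketch below is illustrative only and rests on assumptions: the state is one boolean b (encoded 0 or 1), $S$ is the hypothetical statement $b \to b := \mathit{false}$, $T$ is $b \to \mathsf{skip}$, the rank of a state is the value of b, and every helper name is invented for the example.

```python
from itertools import combinations

STATES = frozenset({0, 1})        # hypothetical state: value of a boolean b
PREDS = [frozenset(c) for n in range(3) for c in combinations((0, 1), n)]
EMPTY = frozenset()


def pt(f):
    """Tabulate a predicate transformer as {postcondition: precondition}."""
    return {q: frozenset(f(q)) for q in PREDS}


def guard(p): return pt(lambda q: (STATES - p) | q)
def seq(s, t): return {q: s[t[q]] for q in PREDS}
def meet(s, t): return {q: s[q] & t[q] for q in PREDS}
def grd(s): return STATES - s[EMPTY]


SKIP, MAGIC, ABORT = pt(lambda q: q), pt(lambda q: STATES), pt(lambda q: EMPTY)


def fix(f, x):
    """Iterate a monotonic f from x until it stabilises."""
    while f(x) != x:
        x = f(x)
    return x


def strong(s): return fix(lambda x: meet(seq(s, x), SKIP), ABORT)
def weak(s): return fix(lambda x: meet(seq(s, x), SKIP), MAGIC)
def plus(s): return seq(s, weak(s))
def try_(s): return meet(s, guard(STATES - grd(s)))
def do(s): return seq(strong(s), guard(STATES - grd(s)))
def left_fair(s, t): return meet(plus(s), seq(plus(t), try_(s)))


B = frozenset({1})
S = seq(guard(B), pt(lambda q: STATES if 0 in q else EMPTY))   # b -> b := false
T = guard(B)                                                   # b -> skip

rank = {0: frozenset({0}), 1: frozenset({1})}                  # r_w: "b has value w"


def below(w):                                                  # r_{<w}
    return frozenset().union(*(rank[v] for v in rank if v < w))


r = STATES                                                     # some ranked predicate holds
post = r - (grd(S) | grd(T))                                   # r and not g

premise_S = all(rank[w] <= S[below(w)] for w in rank)
premise_T = all(rank[w] <= T[(rank[w] & grd(S)) | below(w)] for w in rank)
thm7_premise_T = all(rank[w] <= T[below(w)] for w in rank)
fair_ok = r <= do(left_fair(S, T))[post]
demonic_ok = r <= do(meet(S, T))[post]
```

In this instance both premises of Theorem 8 hold and the loop that is fair to $S$ establishes the postcondition from every state; the stronger premise of Theorem 7 fails for $T$, and the purely demonic loop does not establish the postcondition when b is initially true.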

For reasons of symmetry, we get an analogous theorem for do $S \triangleright T$ od.
Furthermore, with a similar proof, we get the following theorem:

Theorem 9. Let S and T be monotonic predicate transformers and $\{r_w \mid w \in W\}$ a
collection of ranked predicates. Let $g = \mathrm{grd}\,S \lor \mathrm{grd}\,T$. Then:

$$(r_w \le S\,r_{<w}) \land (r_w \le T\,((r_w \land \mathrm{grd}\,S) \lor r_{<w})) \;\Rightarrow\; r \le \textbf{do } S \diamond T \textbf{ od}\,(r \land \lnot g)$$

The assumption here is the same as that of Theorem 8. It cannot be weakened
to allow $S$, analogously to $T$, either to make progress towards termination or
to enable $T$. For example, if both $S$ and $T$ disable themselves but enable
each other, $S$ and $T$ would be selected alternately and no progress towards
termination would be enforced.

5 Discussion 



In [12] Dijkstra’s calculus is extended by a fair choice operator. The approach
relies on temporal predicate transformers like “always” and “eventually” and
on syntactic substitutions of fair choice by angelic choice, neither of which is
needed here. Interestingly, the resulting notion of (weak) fairness is stronger
than ours: For a loop to terminate it is already sufficient if any one alternative
makes progress towards termination, whereas we require that one alternative
must always make progress. On the other hand, a round robin scheduler is a
sound implementation technique here, but not in [12].

In [6] Dijkstra’s calculus is also extended by a fair choice operator. Although
the definition of do $S \diamond T$ od looks textually similar, the differences are
subtle but substantial: the iteration is not defined by the greatest fixed point,
but in terms of the dovetail operator $\nabla$ that models fair parallel execution.
The definition of dovetail requires the distinction between possible and definite
nontermination, which is done by additionally considering weakest liberal
preconditions. The expressiveness of the dovetail operator leads to problems with
non-monotonicity and to the need for two ordering relations, both of which are
avoided here.

In [15] fairness in action systems, a generalization of loops allowing possibly 
infinite computations, with predicate transformer semantics is considered. The 
approach there is that unfair non-terminating sequences are explicitly specified, 
rather than fairness expressed by a fair choice as elsewhere. This allows a wider 
range of fairness constraints to be expressed compared to the (weak) fairness 
considered here, though in a different style. 

A number of issues found elsewhere have not been treated here. First, The- 
orem 8 can be weakened by allowing the fair statement S to be “helpful” by 
decreasing the rank of the precondition only in some states. Such a theorem 
then makes use of additional invariants characterizing the helpful states [5] . Sec- 
ondly, we have not considered fairness with more than two statements. Finally, 
we have not considered other forms of fairness like strong fairness and uncondi- 
tional fairness [9] . It would be particularly interesting to see whether other forms 
of fairness can be treated similarly in an algebraic style. 

It is worth pointing out that all of our results have been derived without the
common assumption of conjunctivity. A predicate transformer $S$ is called
(positively) conjunctive if it satisfies $S\,(\forall i \in I \bullet q_i) = (\forall i \in I \bullet S\,q_i)$
for any set of predicates $q_i$, where $I$ is a nonempty set. Conjunctivity rules out angelic
nondeterminism: the predicate transformers $[p]$, $\{p\}$, $[R]$ are all conjunctive (but not
$\{R\}$) and the operators $;$, $\sqcap$ preserve conjunctivity (but not $\sqcup$). The
combination of demonic and angelic nondeterminism has recently led to a game-theoretic
view of predicate transformers [3], with applications to the modeling of
interactive systems. Thus our results about fairness carry over to this general setting.
Abstract. This paper sets out a programme of work in the area of
dependability. The research is to be pursued under the aegis of a six-year
Inter-Disciplinary Research Collaboration funded by the UK Engineering
and Physical Sciences Research Council. The aim is to consider
computer-based systems which comprise humans as well as hardware and
software. The aim here is to indicate how formal methods ideas, coupled
with structuring proposals, can help address a problem which clearly also
requires social science input.



Extended Abstract 

1 Reasoning about Interference 

This section summarises earlier work on formal development methods for con- 
current systems. 

The essence of concurrency is interference: shared-variable programs must
be designed so as to tolerate state changes; communication-based concurrency
shifts the interference to that from messages. One possible way of specifying
interference is to use rely/guarantee-conditions (see [7,16,17,19,2,4]).

Programming language designers have proposed a series of increasingly so- 
phisticated constructs to control interference; the case for using object-oriented 
constructs is set out in [8] . 

2 Faults as Interference 

The essence of this section is to argue that faults can be viewed as interference in 
the same way that concurrent processes bring about changes beyond the control 
of the process whose specification and design are being considered. Without yet 
proposing notation for each case, a range of motivating examples are considered. 

The first example is one that re-awakened this author’s interest in considering
faults as interference. Faced with the task of specifying a traffic light system,
many computer scientists would confine themselves to the control system and
specify the signals which must be emitted. Michael Jackson (see [6]) considers the
wider issues of the correct wiring of the control system to the physical lights and
the initial state of these light units. One could widen the specification to address
the overall light system (at one level the requirement is that at least one light
must always be red) and record assumptions (as rely-conditions) which state that
emitting a signal implies that the light unit changes state. Recording such a
rely-condition does not itself result in a dependable system but it ensures that the
assumptions are recorded and use of proof rules for development of concurrency
should ensure that there are no further hidden assumptions. In fact, one could
take this example further by specifying that the real requirement is to reduce
the probability of a crash to a certain level and then record probabilities that
drivers behave in certain ways when faced with red lights (see, for example, [10]
for ways of reasoning about probabilities in design).

A second, trivial, example should again illustrate the shift of view in
documenting assumptions. Rather than specifying a control system in terms of
the readings delivered by measuring devices, it might be preferable to
specify the overall system in terms of the actual temperature etc. and provide a
rely-condition which records the acceptable tolerance on the measuring device.
Here again, the message is to expose the assumptions.

A more realistic example can be given in the same domain: it would be 
common for such sensors to be deployed using “triple modular redundancy”. 
The viewpoint of recording the assumptions would suggest that a rely-condition 
should state that two roughly equal measurements are far less likely to be in 
error than one which is wildly different (or is perhaps some distinguished error 
value). 
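
One way to read such a rely-condition is as an executable assumption about the sensors. The sketch below is purely illustrative: the function name `vote`, the `tolerance` parameter and the use of `None` as the distinguished error value are assumptions made for this example, not notation from the text.

```python
from statistics import median
from typing import Optional, Sequence


def vote(readings: Sequence[Optional[float]], tolerance: float) -> float:
    """Return the median reading if at least two readings agree with it.

    Assumed rely-condition: at most one of the three sensors is faulty,
    so at least two valid readings lie within `tolerance` of the median.
    """
    valid = [r for r in readings if r is not None]  # drop error values
    if len(valid) < 2:
        raise ValueError("rely-condition violated: fewer than two valid readings")
    mid = median(valid)
    # Readings that agree with the median to within the tolerance.
    agreeing = [r for r in valid if abs(r - mid) <= tolerance]
    if len(agreeing) < 2:
        raise ValueError("rely-condition violated: no two readings agree")
    return mid
```

Raising an exception when the assumption fails keeps the rely-condition visible: the voter does not claim a result that the recorded assumption cannot justify.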

As well as the primary message that exposing assumptions will force their 
consideration, there is the clear advantage that checking that such rely-conditions 
are adequate to prove that a system will meet its overall specification will check 
for any missed assumptions. 

3 Fault Containment and Recovery 

Significant work has been done on designing architectures for fault-containment 
and recovery - see for example [11,18]. 

4 Human Errors and Their Containment 

The work in the Dependability Interdisciplinary Research Collaboration on which 
we are embarking will address not just dependable computer systems but will 
also consider wider systems where the role of the humans involved is seen as 
critical to overall system dependability. The need for this is emphasized by [9] 
which reports a large number of computer related accidents which resulted in 
death and notes that in the majority of cases the key problem related more to the 
interaction between people and computers than a specific hardware or software 
malfunction. 

There are of course many examples of where a program tries to guard against 
inadvertent errors of its users: the check in many operating systems asking a 
user to confirm the request to delete files or the need to retype a new password 
(being invisible there is no other check) are trivial instances. More interesting is 
the architecture of the overall system known as Pen&Pad [5] in which software 
is programmed to warn against possible misprescription of drugs by doctors: no 
attempt was made to automate prescription but the system would check against 
dangerous cocktails or specific drugs which might not be tolerable to some other 
condition that is indicated on the patient’s record. 

The logical extension of the work outlined in the two preceding sections on 
purely computer systems is to aim for a more systematic treatment of human 
errors. Fortunately, the work of psychologists like Reason (see [12]) in categorising 
human errors offers the hope of describing and reasoning about the sort of human 
errors against which a system is designed to guard. The objective would be to 
minimise the risk of the errors of the computer system and (groups of) humans 
“lining up” in the way indicated in [13]. 



5 Further Research 

There are many further areas of research related to the themes above. For ex- 
ample: 

— Both pre and rely-conditions can record assumptions but if they become 
complex they might be a warning that an interface has become too messy 
(cf. [1]) - ways of evaluating interfaces and architectures are needed (see [15]). 

— The idea of using rely-conditions to record failure assumptions occurred to 
the author in connection with a control system some years ago. One reason 
for not describing the idea more publicly was that there often appears to be 
a mismatch of abstraction levels between the specification and the 
error-inducing level. There needs to be more research on whether this can be 
avoided. 

— The role of malicious attacks is being considered in the IST-funded MAFTIA 
project. 

— A key area of system “misuse” is where the user has an incorrect model of 
what is going on inside the combined control/controlled system - minimizing 
this risk must be an objective. 

— Progress in modelling the human mind (e.g. [3]) should be tracked. 
Abstract. Traditional rules for refinement of abstract data types suggest 
a software development process in which much of the detail has to be 
present already in the initial specification. In particular, the set of avail- 
able operations and their interfaces need to be fixed. In contrast, many 
formal and informal software development methods rely on changes of 
granularity and require introduction of detail in a gradual way during 
the development process. 

This paper discusses several generalisations and extensions of the tradi- 
tional refinement rules, which are compatible with each other and, more 
importantly, with the semantic grounding of data refinement. Together 
they should provide a semantic justification for a larger spectrum of de- 
velopment steps. 

The discussion takes place in the context of the formal specification lan- 
guage Z and its relational underpinnings. 

Keywords: Refinement, formal methods, Z. 



1 Introduction 

The theory of data refinement as described in essence by He, Hoare and 
Sanders [17] (summarised in Section 2) provides a useful model of program devel- 
opment in terms of abstract data types for which the set of operations is known 
already. In the working out of this theory, the abstract data types (ADTs) were 
assumed to have identical sets of operations, i.e. they were conformal. The rep- 
resentations of this theory in languages like Z [24] have until recently further 
restricted this context. The emphasis in the early work on refinement in Z was 
on the incomplete downward simulation rule, forbidding postponement of non- 
determinism; and it was implied that input and output were immutable (i.e. in 
a refinement step, inputs and outputs could not change). Recent work [27,25,5] 
has relaxed these unnecessary restrictions by more fully exploiting the theory 
of [17]. In this paper we discuss how these and some other generalisations of 
data refinement relate to each other and how they can be interpreted in terms 
of the relational model. In particular it is shown how they represent a number 
of liberalisations of the notion of conformity of data types, and how these liber- 
alisations are compatible with each other. The comparison between ADTs and 
process algebra proves to be a recurrent theme which is both inspirational and 
sometimes confusing. 
Our motivation for this work is twofold. The primary motivation stems from 
using a language like Z for “partial specification”, as proposed in e.g. [2,28,6]. 
In a partial specification approach, a system is not described by a single spec- 
ification, but by a collection of interlocking ones, possibly written in different 
languages, each describing a different aspect. (Note that this situation also arises 
when one uses a notation like UML.) Combination of partial specifications is by 
construction of a common refinement, whose existence witnesses to the consis- 
tency of the partial specifications. From a methodological point of view, it is 
essential that partial specifications may mention only aspects of the system of 
relevance to their particular viewpoint. For example, only part of the external 
interface for an operation may appear in a given viewpoint, and we require that 
the common refinement can add to, or alter, this interface (i.e. the IO of an 
operation). This requires a more liberal notion of refinement. 

Alternatively, one could view this work as an attempt to provide a semantic 
justification to a larger number of rigorous, intuitively “correct”, steps in pro- 
gram development. Such steps include modifications of inputs and outputs, and 
changes of granularity of operations. 

This paper is organised as follows. Section 2 summarises the data refinement 
theory as presented in [17], and the traditional [24] representation of data refine- 
ment in Z. Readers who are familiar with the area might prefer to skip Section 2 
on a first reading, and look at the motivating example in Section 3 first. Section 3 
presents a sample ADT in Z, and considers a number of desirable refinements. 
Subsequent sections present generalised refinement rules and sketch how they 
are derived from the theory or generalisations of it. The final section contains 
some conclusions. 

2 Data Refinement: The Relational View 

In this section, we first summarise the definitions of simulation and refinement 
as presented for total relations by He, Hoare and Sanders [17]. Then we discuss 
how these definitions can be modified to partial relations. Finally we give the 
traditional presentations of these rules in Z, indicating how input and output 
can be represented. 

2.1 Data Refinement Refined 

A seminal paper in the area of data refinement is the paper by He, Hoare and 
Sanders in ESOP’86 [17], following on from Hoare’s earlier paper [18]. It presents 
conditions for data refinement using upward and downward simulations, remov- 
ing some of the conditions imposed on such simulations in the VDM litera- 
ture [19]. We paraphrase their main results below. 

We assume the existence of some global data space G. 

Definition 1 (Data type). A data type is a quadruple $(S, Init, \{Op_i\}_{i \in I}, Fin)$. 
The operations $\{Op_i\}$, indexed by $i \in I$, are total relations on the set $S$; $Init$ is 
a relation from $G$ to $S$; $Fin$ is a relation from $S$ to $G$. □ 
(We omit the extra requirement on so-called conditions here; these are less central 
once we move to partial relations.) 

Data types can be compared if they are conformal: 

Definition 2 (Conformal). Two data types are conformal if their global data 
space and the indexing sets of their operations coincide. □ 

For the rest of this section, assume that all data types considered are conformal, 
using some fixed index set I. 

The comparison between data types is in terms of programs using them - 
i.e., these are black box data types. 

Definition 3 (Complete program). A complete program over a data type 
$(S, Init, \{Op_i\}_{i \in I}, Fin)$ is an expression of the form $Init \mathbin{;} P \mathbin{;} Fin$, where $P$, a 
relation over $S$, is a program over $\{Op_i\}_{i \in I}$. □ 

The notion of a program may vary, but should allow at least all operations 
and be closed under sequential composition. In [17] Dijkstra’s guarded command 
language is used as a programming language; in this paper we will concentrate 
on a minimal language where every program consists of a finite sequence of oper- 
ations. (This is not a fundamental restriction as any control flow including loops 
can be represented this way by making the program counter a state variable.) 
Using the now traditional notation, for a sequence $p$ over $I$ and data type $X$, 
$p_X$ denotes the complete program over $X$ characterised by $p$. (E.g. if $X = 
(S, Init, \{Op_1, Op_2, Op_3\}, Fin)$ then $[1,3,1]_X$ denotes $Init \mathbin{;} Op_1 \mathbin{;} Op_3 \mathbin{;} Op_1 \mathbin{;} Fin$.) 
Refinement between data types is then defined by quantification over all pro- 
grams using them: 

Definition 4 (Data refinement). For data types $A$ and $C$, $C$ refines $A$ iff for 
each finite sequence $p$ over $I$, $p_C \subseteq p_A$. □ 

This is not an easy property to verify as it quantifies over all finite programs. 
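
To see why the quantification is awkward, Definition 4 can be checked by brute force for small finite data types, but only up to a bounded program length. The following is a minimal sketch under that assumption; relations are modelled as Python sets of pairs, and the names `compose`, `complete_program`, `refines_up_to` and the toy data types `A` and `C` are hypothetical.

```python
from itertools import product


def compose(r, s):
    """Forward relational composition r ; s on relations given as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def complete_program(adt, p):
    """Init ; Op_p1 ; ... ; Op_pn ; Fin for a sequence p of operation indices."""
    init, ops, fin = adt
    rel = init
    for i in p:
        rel = compose(rel, ops[i])
    return compose(rel, fin)


def refines_up_to(concrete, abstract, index_set, max_len):
    """Check p_C <= p_A for every sequence p over index_set of length <= max_len."""
    return all(
        complete_program(concrete, p) <= complete_program(abstract, p)
        for n in range(max_len + 1)
        for p in product(index_set, repeat=n)
    )


# Toy conformal data types: A has a non-deterministic operation, C a deterministic one.
A = ({("g", 0)}, {"op": {(0, 0), (0, 1), (1, 0), (1, 1)}}, {(0, 0), (1, 1)})
C = ({("g", 0)}, {"op": {(0, 1), (1, 0)}}, {(0, 0), (1, 1)})
```

Here `refines_up_to(C, A, ["op"], 4)` holds while the converse fails. A bounded check of this kind can refute a refinement but never establish it, which is what motivates the stepwise method of the next section.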

2.2 Upward and Downward Simulations 

Simulations provide a method of checking data refinement in a stepwise fashion, 
i.e. by comparing concrete and abstract data types per operation (and initiali- 
sation and finalisation). 

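Definition 5 (Simulations). Assume data types $A = (AS, AI, \{AOp_i\}_{i \in I}, AF)$ 
and $C = (CS, CI, \{COp_i\}_{i \in I}, CF)$. 

A downward simulation is a relation $R$ from $AS$ to $CS$ satisfying 

$$CI \subseteq AI \mathbin{;} R$$ 
$$R \mathbin{;} CF \subseteq AF$$ 
$$\forall i : I \bullet R \mathbin{;} COp_i \subseteq AOp_i \mathbin{;} R$$ 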

This is also called a forward simulation, or L-simulation [9]. 
An upward simulation is a finitary relation $T$ from $CS$ to $AS$ such that 

$$CI \mathbin{;} T \subseteq AI$$ 
$$CF \subseteq T \mathbin{;} AF$$ 
$$\forall i : I \bullet COp_i \mathbin{;} T \subseteq T \mathbin{;} AOp_i$$ 

This is also called a backward simulation, or $L^{-1}$-simulation. Simulations are 
sometimes called retrieve relations. □ 
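
For finite relations, the three downward simulation conditions can be checked directly, one operation at a time. The sketch below assumes relations are Python sets of pairs and a data type is a triple `(init, ops, fin)`; the function names and the toy data types are hypothetical.

```python
def compose(r, s):
    """Forward relational composition r ; s on relations given as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def is_downward_simulation(r, abstract, concrete):
    """Check the initialisation, finalisation and per-operation conditions."""
    ai, aops, af = abstract
    ci, cops, cf = concrete
    return (
        ci <= compose(ai, r)                 # CI <= AI ; R
        and compose(r, cf) <= af             # R ; CF <= AF
        and all(                             # R ; COp_i <= AOp_i ; R
            compose(r, cops[i]) <= compose(aops[i], r) for i in aops
        )
    )


# Abstract states 0/1 with a non-deterministic operation;
# concrete states "a"/"b" with a deterministic toggle.
A = ({("g", 0)}, {"op": {(0, 0), (0, 1), (1, 0), (1, 1)}}, {(0, 0), (1, 1)})
C = ({("g", "a")}, {"op": {("a", "b"), ("b", "a")}}, {("a", 0), ("b", 1)})
R = {(0, "a"), (1, "b")}
```

With these definitions `is_downward_simulation(R, A, C)` holds, whereas the swapped relation `{(0, "b"), (1, "a")}` already fails the initialisation condition.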

By structural induction on the programs it can be proved that the existence of 
either of these simulations is sufficient for data refinement. Moreover: 

Theorem 1. For finitary datatypes, upward and downward simulation are 
jointly complete, i.e. any data refinement can be proved by a combination of 
simulations. 

This is proved by first constructing a deterministic equivalent of the “abstract” 
type, using a powerset construction (this is an upward simulation). A downward 
simulation can then be found to the “concrete” type. 

For canonical data types, i.e. those where all operations are functions, upward 
and downward simulation coincide. Their difference (and incompleteness) is usu- 
ally exemplified by data types where the composition of two operations (which 
are constrained to always occur in sequence) contains non-determinism. In that 
case, downward simulation allows the non-determinism to be moved only to the 
first of the two operations and upward simulation allows for the non-determinism 
to be postponed to the second, but not vice versa. 

A single complete rule can be given in the context of predicate transform- 
ers [16]. The Z version of this, using powersimulations, is given in [10]. 

2.3 Dealing with Partial Relations 

The rules given above provide a data refinement relation for ADTs whose op- 
erations are total relations. However, in languages like Z, operations are often 
specified by partial relations. 

There are two fundamentally different approaches to partial operations in 
the world of state based systems. They are: 

— the “contract” approach to ADTs: the domain (precondition) of an operation 
describes the area within which the operation should be guaranteed to deliver 
a well-defined result, as specified by the relation. Outside that domain, the 
operation may be applied, but may return any value, even an undefined 
one (modelling e.g. non-termination). This represents a situation in which 
the customer employs a black box data type, to be used in their system in 
situations only when the precondition holds. 

— the “behavioural” approach to ADTs: operations may not be applied outside 
their precondition; doing so anyway leads to an undefined result. This repre- 
sents the situation where the customer employs an independently operating 
machine, the operations of the ADT corresponding to possible interactions 
between the system and its environment. 
The first approach is the more common one in Z. The second approach is used 
in the object-oriented variant Object-Z [22], and in some distributed systems 
oriented applications in Z (“firing conditions” [26]). Clearly this approach takes 
its inspiration from process algebra; however, it is unclear to what extent this 
analogy is complete. Process algebra tends to deal with potentially infinite traces, 
whereas the derivation of data refinement takes only finite programs into account. 
Attempts to integrate Z with CSP have obviously also used the firing condition 
interpretation of preconditions [23,15,7]. 

In either approach, data refinement rules for partial operations are derived 
from the rules above by modelling partial operations by total operations in a 
domain enriched with a $\bot$ value. Details of these derivations can be found in [27] 
for the “contract” approach to ADTs, both upward and downward simulations; 
and in [7] for downward simulations in the behavioural approach. We summarise 
these results here. 

Let $X_\bot = X \cup \{\bot\}$ for any $X$ and distinguished element $\bot \notin X$. 

Definition 6 (Totalisation). For a partial relation $Op$ on $State$, its 
totalisation is a total relation $\widehat{Op}$ on $State_\bot$, defined by 
$\widehat{Op} = Op \cup \{(x, y) \in State_\bot \times State_\bot \mid x \notin \mathrm{dom}\,Op\}$ 
in the “contract” approach to ADTs, or by 
$\widehat{Op} = Op \cup \{(x, \bot) \in State_\bot \times State_\bot \mid x \notin \mathrm{dom}\,Op\}$ 
in the behavioural approach to ADTs. □ 
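
The two totalisations differ only in what happens outside the domain of the operation. A minimal sketch for finite state sets follows; the sentinel `BOTTOM` and the function names are hypothetical stand-ins chosen for this example.

```python
BOTTOM = "bottom"  # hypothetical stand-in for the distinguished undefined value


def dom(op):
    """Domain of a relation given as a set of pairs."""
    return {x for (x, _) in op}


def totalise(op, states, approach):
    """Totalise a partial relation on `states` over the lifted state set."""
    lifted = set(states) | {BOTTOM}
    outside = lifted - dom(op)  # includes BOTTOM itself
    if approach == "contract":
        # Outside the precondition anything may happen, including BOTTOM.
        extra = {(x, y) for x in outside for y in lifted}
    elif approach == "behavioural":
        # Outside the precondition the result is always undefined.
        extra = {(x, BOTTOM) for x in outside}
    else:
        raise ValueError(f"unknown approach: {approach}")
    return set(op) | extra
```

For `op = {(0, 1)}` on states `{0, 1, 2}`, the contract totalisation adds every pair starting from `1`, `2` or `BOTTOM`, while the behavioural one adds only the three pairs leading to `BOTTOM`.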

To preserve undefinedness, only simulation relations of particular forms are 
considered, viz. those which relate $\bot$ to every other value, or $\bot$ to $\bot$, depending on 
the ADT interpretation (“lifting” in [27]). 

The resulting refinement rules for totalised relations can then be simplified 
to remove any reference to $\bot$. Here, we give only the refinement rules as they are 
derived for downward simulation. For the upward simulation rules, see [27,9]. 

Definition 7 (Downward simulation for partial relations). Assume data 
types $A = (AS, AI, \{AOp_i\}_{i \in I}, AF)$ and $C = (CS, CI, \{COp_i\}_{i \in I}, CF)$ where 
the operations may be partial. Assume the “contract” interpretation of ADTs. 

A downward simulation is a relation $R$ from $AS$ to $CS$ satisfying 

$$CI \subseteq AI \mathbin{;} R$$ 
$$R \mathbin{;} CF \subseteq AF$$ 
$$\forall i : I \bullet (\mathrm{dom}\,AOp_i \lhd R) \mathbin{;} COp_i \subseteq AOp_i \mathbin{;} R$$ 
$$\forall i : I \bullet \mathrm{ran}(\mathrm{dom}\,AOp_i \lhd R) \subseteq \mathrm{dom}\,COp_i$$ 

where $A \lhd R = \{(x, y) \in R \mid x \in A\}$. The latter two conditions are commonly 
referred to as “correctness” and “applicability”. The inequalities in these 
conditions allow strengthening of the postcondition (reduction of non-determinism) 
and weakening of the precondition (increasing termination), respectively. In the 
“behavioural” interpretation of ADTs, the latter is not possible, i.e. 
applicability is strengthened to: 

$$\mathrm{ran}(\mathrm{dom}\,AOp_i \lhd R) = \mathrm{dom}\,COp_i$$ 

□ 
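
The correctness and applicability conditions can be sketched for finite partial relations as follows. This is an illustration under the assumption that relations are Python sets of pairs; all function names are hypothetical.

```python
def compose(r, s):
    """Forward relational composition r ; s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def dom(r):
    return {x for (x, _) in r}


def ran(r):
    return {y for (_, y) in r}


def dom_restrict(a, r):
    """Domain restriction: the pairs of r whose first component lies in a."""
    return {(x, y) for (x, y) in r if x in a}


def correctness(r, aop, cop):
    return compose(dom_restrict(dom(aop), r), cop) <= compose(aop, r)


def applicability(r, aop, cop, behavioural=False):
    reachable = ran(dom_restrict(dom(aop), r))
    return reachable == dom(cop) if behavioural else reachable <= dom(cop)


# Identity retrieve relation on states {0, 1, 2}.
R = {(s, s) for s in range(3)}
AOP = {(0, 1), (0, 2)}   # defined only at 0, non-deterministic
COP = {(0, 1), (1, 1)}   # less non-determinism, wider precondition
```

In this example `COP` passes correctness and contract applicability, but fails behavioural applicability because it has widened the precondition to include state `1`.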
2.4 Traditional Z Refinement 

Before we can give the Z presentations of these rules, we need to introduce 
schemas and in particular the schema “calculus”, essentially a predicate calculus 
over labelled tuples. 

A schema in Z has the following shape: 

\begin{schema}{ASchema} 
Declarations 
\where 
predicate 
\end{schema} 



The Declarations consist of a list of items of the form name : Set, and possibly 
names of other schemas, possibly with decorations (cf. below). The predicate is 
a predicate on the names introduced in the declarations (which can be omitted 
if it is true). Inclusion of another schema means inclusion of all its names (type 
clashes for any of these names should be avoided) in the declarations, and con- 
junction of its predicate to the including schema’s predicate. The conjunction 
of two schemas is an anonymous schema that includes both sets of declarations 
and has the conjunction of the predicates; the disjunction of two schemas has 
the declarations of both, and the disjunction of their predicates. Quantifications 
over schemas have the obvious meaning: if ASchema is as defined above, 

$$\forall ASchema \bullet P \;=\; \forall Declarations \bullet predicate \Rightarrow P$$ 

$$\exists ASchema \bullet P \;=\; \exists Declarations \bullet predicate \land P$$ 

Given a schema ASchema as above, the schema ASchema' denotes the schema 
where all its names have been decorated with a '. A primed schema like that 
conventionally denotes the after state of an operation. The schema $\Delta ASchema$ 
denotes $ASchema \land ASchema'$; the schema $\Xi ASchema$ denotes $\Delta ASchema$ 
extended with a predicate asserting that name = name' for all names in ASchema. 
Inputs to operations have names ending in ?, outputs have names ending in !; 
?Op and !Op denote schemas containing only the inputs and outputs of Op, 
respectively. Schemas can be used as predicates in contexts where all their names 
have compatible declarations, in which case they denote their predicate plus all 
restrictions that follow from their declarations. 

The common “states and operations” style of specification used in most text- 
books on Z represents an ADT that is not the 4-tuple datatype of Definition 1: 
it has no finalisation, no reference to global state, and operations are not defined 
on the state only, but also on inputs and outputs. Let us call this the “Standard 
Z ADT”. For simplicity, let us assume all operations have the same type of input 
and the same type of output. (This could be ensured by using sum types, where 
the summands are singleton types for absent input or output.) 

Definition 8 (Standard Z ADT). A standard Z ADT is a triple 
$(State, Init, \{Op_i\}_{i \in I})$ such that $\emptyset \subset Init \subseteq State$, and $Op_i$ are relations between 
$State \times Input$ and $State \times Output$ for particular types $Input$ and $Output$. Each 
of these components is represented by a schema. 
Such a standard Z ADT represents the data type (in the sense of Definition 1) 
$(State^{\circ}, Init^{\circ}, \{Op_i^{\circ}\}_{i \in I}, Fin^{\circ})$ whose components are defined below. Note that 
the global state $G = \mathrm{seq}\,Input \times \mathrm{seq}\,Output$. Variables $\mathit{in}$ and $\mathit{is}$ denote 
(sequences of) $Input$, $o$ and $\mathit{os}$ denote (sequences of) $Output$. 

$$State^{\circ} = State \times \mathrm{seq}\,Input \times \mathrm{seq}\,Output$$ 
$$Init^{\circ} = \{((\mathit{is}, \mathit{os}), (s, \mathit{is}, \langle\rangle)) \mid s \in Init\}$$ 
$$Op_i^{\circ} = \{((s, \langle \mathit{in} \rangle \frown \mathit{is}, \mathit{os}), (s', \mathit{is}, \mathit{os} \frown \langle o \rangle)) \mid ((s, \mathit{in}), (s', o)) \in Op_i\}$$ 
$$Fin^{\circ} = \{((s, \mathit{is}, \mathit{os}), (\langle\rangle, \mathit{os})) \mid s \in State\}$$ 

□ 

In other words, the global state (representing what is observable about the 
ADT’s behaviour) consists only of a sequence of inputs (to be consumed) and a 
sequence of outputs (to be produced). Initialisation involves initialisation of the 
state plus copying of the input sequence from the global state; in the finalisation 
the output sequence is copied back to the global state. Note that these denote 
“complete” programs, not to be composed with other programs, which justifies 
initialisation discarding all previous output, and finalisation discarding remain- 
ing input. Observe that the input and output sequences are part of the global 
state, and that thus standard Z ADTs can only be conformal if their operations 
have identical inputs and outputs. 
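
The embedding described above can be animated for a small example. The sketch below is an assumption-laden illustration: operations are sets of pairs `((state, input), (state', output))`, the function `run` and the toy `add` operation are hypothetical, and only the output sequence of the final global state is returned (the input component is always empty after finalisation). Partiality is not totalised here: a program that runs out of input simply yields no result.

```python
def run(init, ops, program, inputs):
    """Possible output sequences of Init ; Op ... ; Fin in the embedded data type."""
    # Initialisation: choose an initial state, copy the inputs, discard old output.
    configs = {(s, tuple(inputs), ()) for s in init}
    for i in program:
        configs = {
            (s2, rest[1:], outs + (o,))       # consume one input, emit one output
            for (s, rest, outs) in configs
            if rest
            for ((s1, inp), (s2, o)) in ops[i]
            if s1 == s and inp == rest[0]
        }
    # Finalisation: discard the state and any remaining input, keep the outputs.
    return {outs for (_, _, outs) in configs}


# A toy ADT: a counter modulo 3 that adds its input and reports the new value.
INIT = {0}
OPS = {"add": {((s, n), ((s + n) % 3, (s + n) % 3)) for s in range(3) for n in range(3)}}
```

Running `run(INIT, OPS, ["add", "add"], [1, 2])` yields the single output sequence `(1, 0)`.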

Woodcock and Davies [27] present a detailed derivation of the upward and 
downward simulation rules for standard Z ADTs, by “unwinding” the rules from 
the represented (4-tuple) data type. The simulations used are of a particular 
shape, viz. only those which leave the input and output sequence (components 
of both abstract and concrete state) unchanged. In the case of downward sim- 
ulation, the refinement conditions for finalisation reduce to true; for upward 
simulation they reduce to totality (on the abstract state) of the simulation. This 
derivation proves the soundness of the standard Z (downward simulation) re- 
finement rule, as presented in Spivey’s standard reference [24] and nearly every 
other Z textbook. The translation from relations and sets to schemas should be 
obvious. The precondition of a schema plays the role of the relation’s domain. 

Definition 9 (Precondition). The precondition of a Z operation schema Op 
on state S is a schema on S and the input of Op obtained by hiding (existentially 
quantifying) the after state (S') and output of Op. 

$$\mathrm{pre}\,Op = \exists S'; !Op \bullet Op$$ 

Definition 10 (Standard Z refinement). The standard Z ADT 
$(AS, AInit, \{AOp_i\}_{i \in I})$ is data refined by $(CS, CInit, \{COp_i\}_{i \in I})$ if they are 
conformal, and a relation $R$ on $AS \land CS$ exists such that 

$$\forall CS \bullet CInit \Rightarrow \exists AS \bullet AInit \land R$$ 

and $\forall i : I$ 

$$\forall AS; CS; ?AOp_i; !AOp_i; CS' \bullet \mathrm{pre}\,AOp_i \land R \land COp_i \Rightarrow \exists AS' \bullet AOp_i \land R'$$ 
$$\forall AS; CS; ?AOp_i \bullet \mathrm{pre}\,AOp_i \land R \Rightarrow \mathrm{pre}\,COp_i$$ □ 
For the behavioural interpretation of ADTs, the last condition needs to be 
strengthened to an equality. Note that necessarily (due to conformity) $?AOp_i = 
?COp_i$ and $!AOp_i = !COp_i$. 

If we leave the state space unchanged, i.e. R equals the identity relation on 
AS , the simpler conditions of “operation refinement” result. 

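Definition 11 (Z operation refinement). The standard Z ADT 
$(S, AInit, \{AOp_i\}_{i \in I})$ is operation refined by $(S, CInit, \{COp_i\}_{i \in I})$ if they are 
conformal, and 

$$\forall S \bullet (CInit \Rightarrow AInit)$$ 

and $\forall i : I$ 

$$\forall S; ?AOp_i; !AOp_i; S' \bullet \mathrm{pre}\,AOp_i \land COp_i \Rightarrow AOp_i$$ 
$$\forall S; ?AOp_i \bullet \mathrm{pre}\,AOp_i \Rightarrow \mathrm{pre}\,COp_i$$ □ 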

For the remainder of this paper, we will concentrate on downward simulation, 
and hence when we refer to refinement we are in fact referring to downward 
simulation only. Similar results can be derived for upward simulation. 



3 Examples 



A two-dimensional world, in which some object moves about and its movement 
and position can be observed, is represented by the standard Z ADT 
$(2D, Init, \{Move, Where\})$ where 

\begin{schema}{Move} 
\Delta 2D 
\end{schema} 

\begin{schema}{Where} 
\Xi 2D \\ 
x!, y! : \mathbb{Z} 
\where 
x! = x' \land y! = y' 
\end{schema} 



Because the state components x and y are directly observable through the Where 
operation, any data refinement of this ADT would involve reconstructing x and y 
in Where. Therefore we will mostly concentrate on operation refinements of this 
specification. As Where is total and deterministic, it cannot be refined any fur- 
ther. Possible operation refinements for Move include the following: 

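\begin{schema}{DontMove} 
\Xi 2D 
\end{schema} 

\begin{schema}{Swap} 
\Delta 2D 
\where 
x' = y \land y' = x 
\end{schema} 

\begin{schema}{StepLeft} 
\Delta 2D 
\where 
x' = x - 1 \land y' = y 
\end{schema} 

\begin{schema}{StepRight} 
\Delta 2D 
\where 
x' = x + 1 \land y' = y 
\end{schema} 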
As Move is total, all of these are refinements in both interpretations (“contract” 
and “behavioural”). 
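
The operation refinement conditions of Definition 11 can be checked mechanically on a finite approximation of this example. The sketch below makes a loud assumption: the integer plane is replaced by an N x N grid with wrap-around arithmetic so that every relation can be enumerated. The names `op_refines`, `STATES` and the lower-case operation names are hypothetical.

```python
from itertools import product

N = 4  # assumed grid size; the specification itself ranges over all integers
STATES = set(product(range(N), repeat=2))


def dom(op):
    return {s for (s, _) in op}


def op_refines(cop, aop, behavioural=False):
    """Operation refinement without inputs or outputs (identity retrieve relation)."""
    pre_a, pre_c = dom(aop), dom(cop)
    correct = all(pair in aop for pair in cop if pair[0] in pre_a)
    applicable = pre_a == pre_c if behavioural else pre_a <= pre_c
    return correct and applicable


move = {(s, t) for s in STATES for t in STATES}          # any move at all
dont_move = {(s, s) for s in STATES}
swap = {((x, y), (y, x)) for (x, y) in STATES}
step_left = {((x, y), ((x - 1) % N, y)) for (x, y) in STATES}
step_right = {((x, y), ((x + 1) % N, y)) for (x, y) in STATES}
```

All four candidates refine `move` under both interpretations, while `move` does not refine `step_left`, since refinement may remove non-determinism but not add it.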

However, the following systems are not traditionally considered refinements 
of the above ADT. This is in contrast to the fact that we could consider the 
above ADT a partial or abstract description of every one of them. 



Ex1 A system where Move does not simply have a non-deterministic result, but 
there is external control that we were previously unaware of, i.e. Move is 
replaced by 

\begin{schema}{Translate} 
\Delta 2D \\ 
x?, y? : \mathbb{Z} 
\where 
x' = x + x? \land y' = y + y? 
\end{schema} 

This is not a valid refinement, because Translate has inputs that Move did 
not have; the data type interpretations of these Z ADTs are thus not 
conformal. 

Ex2 A system where the output of Where is in polar coordinates - also not 
allowed because conformity enforces identical output types. (The internal 
representation of the state can be changed to polar coordinates, of course.) 

Ex3 A system where various kinds of Move are possible that we did not distin- 
guish above, e.g. the abstract Move is replaced by a choice of two concrete 
operations StepLeft and StepRight . Again, this would not be a valid refine- 
ment, as the data types involved are not conformal, having different index 
sets. 

Ex4 A system which introduces an internal clock, which is left unchanged by 
Move and Where but incremented by an internal operation Tick, e.g. 



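\begin{schema}{Timed2D1} 
x, y, clock : \mathbb{Z} 
\end{schema} 

\begin{schema}{TInit1} 
Timed2D1 
\where 
x = y = clock = 0 
\end{schema} 

\begin{schema}{TMove1} 
\Delta Timed2D1 
\where 
clock' = clock 
\end{schema} 

\begin{schema}{TWhere1} 
\Xi Timed2D1 \\ 
x!, y! : \mathbb{Z} 
\where 
x! = x' \land y! = y' 
\end{schema} 

\begin{schema}{Tick1} 
\Delta Timed2D1 
\where 
clock' = clock + 1 \\ 
x' = x \land y' = y 
\end{schema} 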
Ex5 A variant¹ of Ex4 where the clock counts down and is set to some value 
initially and by every Move operation: 

\begin{schema}{Timed2D2} 
x, y : \mathbb{Z}; clock : \mathbb{N} 
\end{schema} 

\begin{schema}{TInit2} 
Timed2D2 
\where 
x = y = 0 
\end{schema} 

\begin{schema}{TMove2} 
\Delta Timed2D2 
\end{schema} 

\begin{schema}{TWhere2} 
\Xi Timed2D2 \\ 
x!, y! : \mathbb{Z} 
\where 
x! = x' \land y! = y' 
\end{schema} 

\begin{schema}{Tick2} 
\Delta Timed2D2 
\where 
clock' = clock - 1 \\ 
x' = x \land y' = y 
\end{schema} 



Neither of these is a refinement of 2D according to the Spivey rules. 

Ex6 Another system where the effect of Move is non-deterministic to the ob- 
server, but determined elsewhere; SetNext must be an “internal” operation: 



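\begin{schema}{2Det} 
x, y, nextx, nexty : \mathbb{Z} \\ 
nextset : bool 
\end{schema} 

\begin{schema}{Init} 
2Det 
\where 
x = y = 0 \\ 
\lnot nextset 
\end{schema} 

\begin{schema}{Where} 
\Xi 2Det \\ 
x!, y! : \mathbb{Z} 
\where 
x! = x' \land y! = y' 
\end{schema} 

\begin{schema}{MoveDet} 
\Delta 2Det 
\where 
nextset \land \lnot nextset' \\ 
x' = nextx \land y' = nexty 
\end{schema} 

\begin{schema}{SetNext} 
\Delta 2Det 
\where 
x' = x \land y' = y \\ 
\lnot nextset \land nextset' 
\end{schema} 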



Again the internal operation is a “stuttering step” when the obvious simu- 
lation is used, but then the precondition of MoveDet is stronger than that 
of Move which is not allowed in usual refinement. (The boolean nextset is 
manipulated in such a way that Move and SetNext can only happen alter- 
nately.) 

Ex7 A system where an additional (external) operation is available, e.g. 

\begin{schema}{FromQ} 
\Xi 2D \\ 
d! : \mathbb{R} 
\where 
d! = \sqrt{x^2 + y^2} 
\end{schema} 

or 

\begin{schema}{Reset} 
\Delta 2D 
\where 
x' = y' = 0 
\end{schema} 

which is not viewed as a concrete occurrence of Move. (Reset does refine 
Move.) Clearly this would not be allowed for reasons of conformity. 

¹ This non-divergent variant was suggested by Stefan Kahrs. 

In the following sections, we discuss a number of more liberal notions of refine- 
ment, using the examples above as illustrations. We will consider the liberal- 
isations mostly in isolation, indicating how they can be combined in the final 
section. 



4 IO Refinement 

It has now been mentioned several times that operations and their more concrete 
versions are required to have identical inputs and outputs. This requirement can 
be traced back to the conformity condition on the ADTs when interpreted as 
4-tuple data types. It was mentioned that simulations between those interpreted 
data types consisted of a simulation on the state space, plus identity relations 
between the input and output sequences. Similarly, the initialisation and final- 
isation simply copy the relevant sequence between the global and local state. 
Clearly these identity relations are a focus of potential generalisation. This is 
explored in full detail in [5], based on a generalisation of the “unwinding” in [27]; 
the explanation given here is partially suggested by ideas in [25]. 

Definition 12 (Transformers). An input transformer IT for an operation Op 
is a schema whose outputs exactly match Op's inputs, i.e. for every name x? in 
Op there is a name x! in IT and vice versa, and all of whose other names are 
inputs. Output transformers are defined analogously. □ 

For a formal definition of these notions, and of how IO transformers are applied 
using the schema piping operator $\gg$, cf. [5]. 

In the relational framework, one should view input transformers as being 
applied to every input value at initialisation, and output transformers as being 
applied (inversely) to every output at finalisation. From the refinement require- 
ments on initialisation and finalisation it follows that input transformers must 
be total on the abstract input: every abstract input must still be allowable. 
Similarly, output transformers should be injective from abstract output to con- 
crete output: different abstract outputs should be distinguishable by leading to 
different concrete outputs. 

In [5] we present an IO refinement rule which allows IO refinement concurrently 
with data refinement. However, due to the large number of variables in 
that rule (all those in the data refinement rule of Definition 10, plus 4 more), we 
prefer to separate IO refinement steps from other data refinement steps. 
Definition 13 (Simple input refinement). A simple input refinement of an 
ADT $(S, Init, \{Op_i\}_{i \in I})$ is the replacement of an operation $Op_i$ by an operation 
$Op_i^{\ast}$, such that for some input transformer $IT$ 

$$IT \gg Op_i^{\ast} = Op_i$$ 

In that case, IT needs to be added to the initialisation of the ADT in its 
interpretation as a 4-tuple data type. □ 

The most obvious instances of simple input refinement are: 

— adding an input whose value is not used in the new operation; 

— replacing an input by an isomorphic one (total bijective input transformer) 

The development in Ex1 can now be justified in two steps: the first is a simple 
input refinement step introducing inputs x? and y? which are not used. The 
corresponding input transformer takes the input of Move (there was none, so it 
is the empty tuple) and relates it to any pair x?, y? : Z. The second step is then 
an operation refinement, reducing the non-determinism by using x? and y?. 
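
The two steps for Ex1 can be replayed on a finite approximation. As a stated assumption, the sketch uses an N x N grid with wrap-around arithmetic and restricts the inputs to the same finite range; operations are sets of pairs `((state, input), state')`, and `pipe` is a hypothetical relational reading of schema piping restricted to inputs.

```python
from itertools import product

N = 3  # assumed grid size, for enumeration only
STATES = set(product(range(N), repeat=2))
INPUTS = set(product(range(N), repeat=2))   # finite stand-in for pairs x?, y?


def pipe(it, op):
    """IT >> Op: the transformer's outputs are fed to the operation's inputs."""
    return {
        ((s, a_in), s2)
        for (a_in, c_in) in it
        for ((s, c_in2), s2) in op
        if c_in == c_in2
    }


def dom(op):
    return {x for (x, _) in op}


move = {((s, ()), t) for s in STATES for t in STATES}              # no input
move_in = {((s, d), t) for s in STATES for d in INPUTS for t in STATES}
it = {((), d) for d in INPUTS}        # relates the empty tuple to any pair
translate = {
    (((x, y), (dx, dy)), ((x + dx) % N, (y + dy) % N))
    for (x, y) in STATES
    for (dx, dy) in INPUTS
}
```

Step one is witnessed by `pipe(it, move_in) == move`: the new inputs exist but are ignored. Step two is an ordinary operation refinement, since `translate` is contained in `move_in` and has the same domain.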

Definition 14 (Simple output refinement). A simple output refinement of 
an ADT $(S, Init, \{Op_i\}_{i \in I})$ is the replacement of an operation $Op_i$ by an 
operation $Op_i^{\ast}$, such that for some output transformer $OT$ 

$$Op_i^{\ast} \gg OT = Op_i$$ 

In that case, OT needs to be added to the finalisation of the ADT in its 
interpretation as a 4-tuple data type. □ 

The most obvious instances of simple output refinement are: 

— adding an output whose value does not depend on the state; 

— replacing an output by an isomorphic one (total bijective output trans- 
former) 

Example Ex2 is of course an instance of a total bijective output transformer. 
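
The injectivity requirement on output transformers can be illustrated for the polar-coordinate output of Ex2. The sketch below is only a sampled check on a small integer grid using floating-point arithmetic, which is an assumption of this example; `to_polar`, `radius_only` and `is_injective` are hypothetical names.

```python
import math
from itertools import product


def to_polar(x, y):
    """Candidate output transformer: Cartesian output to (radius, angle)."""
    return (math.hypot(x, y), math.atan2(y, x))


def radius_only(x, y):
    """A lossy transformer that forgets the direction."""
    return math.hypot(x, y)


def is_injective(f, domain):
    """True if no two distinct sample points share an image under f."""
    seen = {}
    for point in domain:
        image = f(*point)
        if image in seen and seen[image] != point:
            return False
        seen[image] = point
    return True


GRID = list(product(range(-3, 4), repeat=2))
```

On this grid `to_polar` is injective, so distinct abstract outputs remain distinguishable, whereas `radius_only` is not and would therefore be unsuitable as an output transformer.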

It is clear that the input and output transformers used in refinement steps 
(unlike simulation relations which play a similar role), need to be preserved in 
order to recapture the abstract IO. For that reason it might be best to define an 
alternative kind of ADT which incorporates input and output transformers for 
every operation. 

The central observation made in [5] is that the IO refinement rules as presented
in this section are (still) a consequence of the data refinement theorems of He,
Hoare and Sanders. For that reason, this liberalisation of refinement does not
require a generalisation of the notion of refinement as given in Definition 4.
Rather, it only requires a generalisation of the embedding of a Z ADT in the
4-tuple data type (Definition 8) which takes into account any input and output
transformers used. However, we do not present this generalisation here as it
would be dominated by the transformations required to treat all operations as
having the same input and output types.



5 Adding Internal Operations

There are two approaches to refining an ADT when we wish to add internal 
operations: use the existing refinement rules (treating the internal operations 
separately), or generalise the existing rules. 

We can maintain the existing rules by using so-called stuttering steps in
a verification of a refinement. A stuttering step is an operation which does not
change the abstract state, represented in Z by $skip = [\Xi S]$ if S is the state space.
We can then ask that all internal operations in an ADT refine skip when verifying
a refinement. Since a stuttering step is total, the requirements for refining to a
concrete internal operation CIOp are

$R \land CIOp \Rightarrow skip$

$R \Rightarrow pre\ CIOp$

where skip is the stuttering step in the original ADT that is being refined.

For example, Ex4 describes a valid refinement if we use stuttering
steps. Considering the observable operations we see that TMove1 refines Move
and TWhere1 refines Where. In addition, the internal operation Tick1 refines
$skip = [\Xi 2D]$, where the retrieve relation is the identity on the common
components of the two ADTs (i.e. x and y).

There are a number of problems with this approach though. The more trivial 
objection is that we have had to break conformity and introduce stuttering 
steps in the original ADT before we can verify the refinement. This seems rather 
artificial. More important is the interpretation of internal operations and the 
potential for divergence through their introduction. 

Internal operations are interpreted as being internal because they are not 
under the control of the environment of the ADT, and in some way they repre- 
sent operations that the system can invoke whenever their preconditions hold. 
Because of this we do not want divergence to be introduced by a refinement as a 
result of their infinitely repeated application. However, refinements of skip have 
precisely this property. Look at Tick1 in Ex4. As an internal operation there is
nothing to stop it being repeatedly invoked; this causes the system to go into
livelock, preventing any other operation from occurring.

Related to this issue is a question concerning the meaning of a precondition 
of an internal operation. For internal operations we need to take a “behavioural” 
interpretation of their precondition, regardless of the interpretation taken for the 
observable operations, because we wish them to be applicable only at certain 
values in the state space. To see this consider an internal protocol timeout being 
modelled by an internal operation. Clearly we wish the timeout to be invoked 
only at certain points within the protocol’s behaviour. Incorrect refinements 
would result if we could weaken its precondition and timeout at arbitrary points. 
Therefore internal operations may not be applied outside their precondition. 

In Ex5 we have attempted to solve the divergence in Ex4 by only allowing
the Tick operation to be invoked a finite number of times. However, this ADT is
not a refinement of 2D even with stuttering steps. In particular, we fail to have

$R \Rightarrow pre\ Tick2$

because the precondition of Tick2 requires that clock > 1 which we cannot
guarantee from R.

A solution to this problem is to draw upon the experience of refining internal 
operations in process algebras. What we need to do in general is to view refine- 
ment in terms of external behaviour, and we can define a set of weak refinement 
rules [13,14] that ensure the observable behaviour of the refined ADT is a valid 
refinement of the observable behaviour of the original ADT. 

We will still assume the sets of observable operations in a refinement are 
conformal, but we extend the notion of a data type to include a set of internal
operations $\{IOp_j\}_{j \in J}$ for some index set J. The definition of conformal
is adjusted to require that the indexing sets of observable operations coincide. 
However, we make no requirements on the conformity of the internal operations, 
and indeed weak refinement will allow the introduction or removal of internal 
operations during a refinement. 

The definition of weak refinement is motivated by the approach taken to 
internal actions in a process algebra. In particular we move from the application 
of a single observable operation Op to a situation where a finite number of 
internal operations are allowed before and after the occurrence of that operation. 
This corresponds to the change from $P \xrightarrow{a} Q$ to $P \stackrel{a}{\Longrightarrow} Q$ in a process algebra
when internal (i.e. unobservable) actions are taken into account. 

Data refinement now requires that for any given program (i.e. choice of ob- 
servable operations), every complete program over C possibly involving internal 
operations is contained in some complete program over A (possibly involving in- 
ternal operations but using the same choice of observable operations) . Formally, 
this leads to the following generalisation of Definition 4. 

Definition 15 (Weak data refinement). Consider data types

$A = (AS, AInit, \{AOp_i\}_{i \in I \cup J}, AFin)$ and $C = (CS, CInit, \{COp_i\}_{i \in I \cup K}, CFin)$,
where J and K are both disjoint from I; the visible (external) operations in both
data types are those from I only. (J and K need not be disjoint.) For a program p
over I, define its set of programs with internal operations in A as follows (and
similarly for C):²

$P_A = \{ q_A \mid q \in \mathrm{seq}(I \cup J) \land q \upharpoonright I = p \}$

Now C weakly refines A iff for each finite sequence p over I,

$\forall x : P_C \bullet \exists y : P_A \bullet x \subseteq y$ □

Observe that this is a correct generalisation: when J and K are empty, $P_A$ and
$P_C$ are singleton sets containing only $p_A$ and $p_C$, respectively.
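Weak data refinement quantifies over all finite programs and over arbitrarily many internal steps, so it cannot be decided by enumeration in general. As a rough illustration only, the sketch below checks a bounded version on a tiny finite model. Everything in it is an assumption made for the example: the dictionary encoding of a data type, the bounds `max_len` and `bound`, and the toy data types `A` and `C` (loosely in the spirit of an internal operation that fixes the next value before a deterministic visible step).

```python
from itertools import product

def compose(r, s):
    """Relational composition r ; s on relations given as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def run(dt, q):
    """Complete program for the index sequence q: Init ; Op_q1 ; ... ; Fin."""
    rel = dt["init"]
    for i in q:
        rel = compose(rel, dt["ops"][i])
    return compose(rel, dt["fin"])

def with_internal(p, visible, internal, bound):
    """All q over visible and internal indices that filter down to p,
    using at most `bound` internal steps (a bounded approximation)."""
    alphabet = sorted(visible | internal)
    return [q for n in range(len(p), len(p) + bound + 1)
            for q in product(alphabet, repeat=n)
            if tuple(i for i in q if i in visible) == tuple(p)]

def weakly_refines(c, a, visible, max_len=2, bound=2):
    """Bounded check: every concrete program is contained in some abstract one."""
    for n in range(max_len + 1):
        for p in product(sorted(visible), repeat=n):
            ys = [run(a, q) for q in with_internal(p, visible, a["internal"], bound)]
            for q in with_internal(p, visible, c["internal"], bound):
                if not any(run(c, q) <= y for y in ys):
                    return False
    return True

# Abstract type: "set" nondeterministically chooses a bit; no internal operations.
A = {"init": {("*", 0)},
     "ops": {"set": {(v, w) for v in (0, 1) for w in (0, 1)}},
     "fin": {(v, v) for v in (0, 1)},
     "internal": set()}

# Concrete type: internal "setnext" picks the next bit, visible "set" installs it.
C_STATES = [(v, n) for v in (0, 1) for n in (0, 1)]
C = {"init": {("*", (0, 0))},
     "ops": {"set": {((v, n), (n, n)) for (v, n) in C_STATES},
             "setnext": {((v, n), (v, m)) for (v, n) in C_STATES for m in (0, 1)}},
     "fin": {((v, n), v) for (v, n) in C_STATES},
     "internal": {"setnext"}}

print(weakly_refines(C, A, {"set"}))
```

A positive answer from this sketch only says that no counterexample exists within the chosen bounds; it is not a proof of weak refinement.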

To verify such weak refinements we adapt the simulation rules given above.
To do so we encode the idea of a finite number of internal operations before
and after an observable operation in the Z schema calculus. In order to avoid
quantifications over sequences of internal operations in the definition of weak
refinement, we encode “all possible finite internal evolution” for a specification
as a single operation Int (allowing us to write Int ⨾ Op ⨾ Int). The details of
how this is done are in [14]. Internal operations in the concrete and abstract
specifications are differentiated by using the subscripts C and A on Int.

² The filtering expression s ↾ A takes from a sequence s only those elements
contained in set A.

The definition of a weak downward simulation then has the same form as
the standard Z definition. We typically replace operations Op by Int ⨾ Op ⨾ Int
to allow for internal evolution before and after. To prevent the introduction of
divergence, there are two additional conditions based upon those in [8].

Definition 16 (Weak downward simulation). Given Z data types
$A = (AS, AInit, \{AOp_i\}_{i \in I \cup J})$ and $C = (CS, CInit, \{COp_i\}_{i \in I \cup K})$, where J and K
(the index sets denoting internal operations) are both disjoint from I. The
relation R on $AS \land CS$ is a weak downward simulation from A to C if

$\forall CS \bullet (CInit ⨾ Int_C) \Rightarrow \exists AS \bullet (AInit ⨾ Int_A) \land R$

and $\forall i : I \cup (J \cap K)$

$\forall AS; CS; CS' \bullet pre(Int_A ⨾ AOp_i) \land R \land (Int_C ⨾ COp_i ⨾ Int_C) \Rightarrow \exists AS' \bullet R' \land (Int_A ⨾ AOp_i ⨾ Int_A)$

$\forall AS; CS \bullet pre(Int_A ⨾ AOp_i) \land R \Rightarrow pre(Int_C ⨾ COp_i)$

where we have elided quantification over inputs and outputs (as in Definition 10)
and (for technical reasons) redefine the precondition of an internal operation IOp
acting on state S as

$pre\ IOp = pre\ \Xi S = S$

In addition, we require the existence of a well-founded set WF with partial
order <, and a variant E which is an expression in the concrete state variables
satisfying the following conditions:

D1  $R \vdash E \in WF$

D2  $\forall i : K \bullet R \land COp_i \vdash E' < E$ □

Note that although internal operations decrease the variant, there are no 
constraints on observable operations, which are allowed to increase the variant. 
This means that an internal operation can be invoked an infinite number of 
times, but not in an infinite sequence between observable operations. 
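The role of the variant can be illustrated on a small finite model. This is a sketch under stated assumptions rather than a rendering of Ex5: the state shape `(x, clock)`, the guard `clock >= 1`, the observable `move` that resets the clock, and all function names are hypothetical, and the retrieve relation R is left out (taken to hold everywhere).

```python
# Hypothetical concrete state: (x, clock).
STATES = [(x, clock) for x in range(3) for clock in range(4)]

def tick(state):
    """Internal operation: enabled only while clock >= 1, and decrements it."""
    x, clock = state
    return {(x, clock - 1)} if clock >= 1 else set()

def move(state):
    """Observable operation: resets the clock, so it may increase the variant."""
    x, _ = state
    return {((x + 1) % 3, 3)}

def variant(state):
    """E = clock, with WF taken to be the natural numbers."""
    return state[1]

def d1(states):
    """D1: the variant always lies in the well-founded set."""
    return all(isinstance(variant(s), int) and variant(s) >= 0 for s in states)

def d2(states, internal_ops):
    """D2: every internal operation strictly decreases the variant."""
    return all(variant(t) < variant(s)
               for op in internal_ops for s in states for t in op(s))

def longest_internal_run(state, internal_ops):
    """Length of the longest run of internal steps from `state`.
    Terminates here only because the internal operations satisfy D2."""
    successors = [t for op in internal_ops for t in op(state)]
    if not successors:
        return 0
    return 1 + max(longest_internal_run(t, internal_ops) for t in successors)

print(d1(STATES), d2(STATES, [tick]))
print(max(longest_internal_run(s, [tick]) for s in STATES))
```

In this model every run of internal steps is bounded by the current value of the variant, while `move` can raise the variant again, so the internal operation may still occur infinitely often over an infinite execution.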

With these definitions in place we can see that Ex4 does not describe a weak 
refinement of 2D because the internal operation Tick1 can introduce divergence,
i.e. it breaks the divergence criteria D1 and D2. 

However, Ex5 does represent a weak refinement of 2D. To see this, note first
that there are no internal operations in 2D. Secondly, Tick2 does not change
any variable apart from clock, and therefore does not alter the effect of the
observable operations. This means that the conditions on the observable
operations hold. The remaining conditions are the correctness of Tick2, which
follows since $R \land Tick2 \Rightarrow \Xi 2D$, and the divergence criteria, which in this
case are trivial (ℕ being the set WF).

Weak refinement can also be used to verify Ex6. Here SetNext is the internal 
operation. With a retrieve relation being the identity on x and y, correctness 
and divergence for SetNext are trivial. However, MoveDet does not refine Move 
without consideration of the internal SetNext. But since applicability in weak 
refinement just requires that 

$pre\ Move \land R \Rightarrow pre(SetNext ⨾ MoveDet)$

and similarly for correctness, the conditions for weak refinement are met.



6 Adding Visible Operations 

Ideas from process algebra served to motivate a generalisation of Definition 4 
that allows internal actions. From the fact that most refinement relations in 
process algebra do not allow addition of new observable actions (“extension of 
the alphabet”) we might then conclude that refinement of ADTs should not lead 
to extended sets of visible operations either. 

In the behavioural approach this appears to be a defensible position. When 
our specification is some usual drinks machine using actions coin, coffee, and 
tea, we would be distinctly unhappy if the implementing machine came with 
a button that allowed the removal of all the money from the machine. Such 
an operation, which clearly disrupts the system state, should not be possible. 
However, adding an operation which displayed the amount of coffee left in the 
machine (without changing the state) would maybe not be as harmful. 

However, in the “contract” interpretation it is less clear that allowing extra 
operations should be unsound. The canonical view of an ADT in this interpreta- 
tion is of a software component whose operations are only required to function 
as specified when their preconditions hold. There being additional operations 
available in the component besides the ones the customer intends to use does 
not appear to be a problem. Indeed, it would require some disingenuity to dis- 
tinguish between an operation that “does not exist” and one whose precondition 
is universally false. The consequence of identifying those is the obvious one: in 
the behavioural approach, no extra operations may be added; in the contract 
approach, they may be added at will. 

The corresponding generalisation of Definition 4 is given below. Implicitly 
it generalises the notion of conformity to the concrete index set containing the 
abstract index set, and as before it quantifies over all abstract programs only, 
not making any restrictions on programs containing indices not in the abstract 
index set. This is in contrast to weak refinement, where all concrete behaviours 
are considered. 

Definition 17 (Data refinement with alphabet extension).

Consider data types $A = (AS, AInit, \{AOp_i\}_{i \in I}, AFin)$ and
$C = (CS, CInit, \{COp_i\}_{i \in I \cup J}, CFin)$. Now C refines A with alphabet extension
iff for each finite sequence p over I,

$p_C \subseteq p_A$ □

Observe that this is a correct generalisation: when J is empty, this reduces to 
Definition 4. The informal justification of this definition is that it ensures that 
the concrete system behaves like the abstract one when executing any sequence 
of instructions that the abstract one allowed. 

In this view, it is fine to add operations (like Reset and FromO in Ex7) in 
the contract interpretation. In the behavioural interpretation, neither is allowed, 
although adding FromO appears harmless. However, this is not only due to the 
fact that FromO leaves the state unchanged, but also to it presenting information 
that was derivable from the other “observing” operation, i.e. Where. (For a more 
extensive discussion on observing operations and their refinement rules, cf. [4].) 

However, this generalisation does not cover the case where in the abstract 
ADT we have a single operation which has multiple concrete representations. 
Example Ex3 suggested that an abstract Move could actually be reflected in 
various sorts of concrete move operations, e.g. StepLeft and StepRight. In other 
words, we would like to link Move to StepLeft ∨ StepRight. Allowing this
requires a further generalisation of Definition 17 and in particular the notion of
conformity: we would give an explicit mapping from the abstract index set to the 
concrete index set. We shall call this an alphabet translation. It would be total 
(every abstract operation has at least one concrete counterpart) and injective 
(every concrete operation reflects at most one abstract operation). 

Definition 18 (Data refinement with alphabet translation).

Consider data types $A = (AS, AInit, \{AOp_i\}_{i \in I}, AFin)$ and
$C = (CS, CInit, \{COp_i\}_{i \in J}, CFin)$. Let α be a total and injective mapping
from I to J. Define the extension α* of α to sequences by

$(s, t) \in \alpha^* \equiv \#s = \#t \land \forall i : 1..\#s \bullet (s_i, t_i) \in \alpha$

Then C refines A with alphabet translation α iff

$\forall p : \mathrm{seq}\, I \bullet \exists q : \mathrm{seq}\, J \bullet (p, q) \in \alpha^* \land q_C \subseteq p_A$ □

Observe that this is a correct generalisation of Definition 17: the identity relation
on I is a total and injective mapping from I to I ∪ J, and its extension to
sequences is the identity over seq I.
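The pointwise extension of an alphabet translation to sequences is easy to make concrete. The sketch below represents the translation as a set of (abstract, concrete) pairs; the operation names come from the Move/StepLeft/StepRight discussion above, while the function names and the inclusion of Where in the mapping are assumptions for illustration.

```python
from itertools import product

def is_total(alpha, abstract_ops):
    """Every abstract operation has at least one concrete counterpart."""
    return all(any(a == i for (a, _) in alpha) for i in abstract_ops)

def is_injective(alpha):
    """Every concrete operation reflects at most one abstract operation."""
    concrete = [c for (_, c) in alpha]
    return len(concrete) == len(set(concrete))

def in_alpha_star(alpha, s, t):
    """(s, t) is in alpha* iff s and t have equal length and are related pointwise."""
    return len(s) == len(t) and all(pair in alpha for pair in zip(s, t))

def translations(alpha, p):
    """All concrete programs q with (p, q) in alpha*."""
    choices = [sorted(c for (a, c) in alpha if a == i) for i in p]
    return [tuple(q) for q in product(*choices)]

alpha = {("Move", "StepLeft"), ("Move", "StepRight"), ("Where", "Where")}

print(is_total(alpha, {"Move", "Where"}), is_injective(alpha))
print(translations(alpha, ("Move", "Where")))
```

Refinement with alphabet translation then asks that, for each abstract program, at least one of the listed concrete programs is contained in the abstract one.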

7 Splitting Operations 

In Section 5 we relaxed the notion of conformity to allow the introduction of 
internal operations, and then in Section 6 we allowed the abstract operations to 
be a subset of the concrete ones. In fact we can go even further and relax the 
requirement of conformity between the sets of observable operations, and the 
purpose of doing so is to support action (or non-atomic) refinement.

Action refinement is when a single operation in the original ADT is refined
by not one but a sequence of concrete operations in the refined data type. Such
action refinements arise quite naturally in a number of settings (see [1]), and
they allow initial specifications to be described independently of the structure
of the eventual implementation. The desired final structure can then be
introduced by non-atomic refinements, which support a change of operation
granularity that is absent if we require conformity. [11] discusses how action
refinement can be supported in Z, and we survey that approach here.

We are now in a situation where data types A and C have different indexing
sets ($I_A$ and $I_C$ respectively) for their observable operations, so to discuss
refinement we need to be informed of the relationship between them. In general
this could be an arbitrary relation, but we consider here a mapping
$\rho : I_A \to \mathrm{seq}\, I_C$, so that ρ describes the concrete counterparts for each
operation in A. For example, if $\rho(3) = \langle 6, 4 \rangle$ then this means that operation
$AOp_3$ is refined by the sequence $COp_6 ⨾ COp_4$. Our liberalised notion of
refinement then becomes

Definition 19 (Action refinement). C action refines A with respect to ρ iff
for each finite sequence p over $I_A$, $(\frown/(\rho \circ p))_C \subseteq p_A$.³ □

For example, if $AOp_1$ is refined by the sequence $COp_1 ⨾ COp_2$ then action
refinement requires that, amongst others, $CI ⨾ COp_1 ⨾ COp_2 ⨾ CF \subseteq AI ⨾ AOp_1 ⨾ AF$.
This is a sort of liveness condition: to every abstract program there is an
equivalent concrete one. However, the liberalisation comes at the expense of
safety: there might be concrete programs that have no abstract counterpart. For
example, $CI ⨾ COp_2 ⨾ CF$ might not correspond to anything at the abstract level.
The requirements of safety are discussed in more detail in [11]; here we briefly
sketch the consequences of Definition 19.
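The concrete counterpart of an abstract program in Definition 19 is obtained by replacing each abstract index with its sequence of concrete indices and flattening the result. A minimal sketch, assuming the decomposition mapping is given as a Python dictionary (the function name `concretise` is hypothetical; the two entries follow the examples in the text):

```python
def concretise(rho, p):
    """Concatenate the concrete counterparts of each index in p,
    i.e. concat(map rho p) in functional-programming terms."""
    return tuple(c for i in p for c in rho[i])

# AOp_3 is refined by COp_6 ; COp_4, and AOp_1 by COp_1 ; COp_2.
rho = {3: (6, 4), 1: (1, 2)}

print(concretise(rho, (3, 1)))   # (6, 4, 1, 2)
print(concretise(rho, ()))       # ()
```

Note that the concrete program (2,), which runs only the second half of a decomposition, is not the image of any abstract program under this mapping; this is the safety gap mentioned above.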

Simulations can be used to make step-by-step comparisons as before. To ease
the presentation let us suppose that in the two data types the indexes coincide
except that one of the abstract operations AOp is refined by the sequence
$COp_1 ⨾ COp_2$. When we consider the applicability and correctness conditions
from Definition 10 in this new setting they become three conditions, namely:

$(\mathrm{dom}\, AOp \lhd R ⨾ COp_1 ⨾ COp_2) \subseteq AOp ⨾ R$
$\mathrm{ran}((\mathrm{dom}\, AOp) \lhd R) \subseteq \mathrm{dom}\, COp_1$
$\mathrm{ran}((\mathrm{dom}\, AOp) \lhd R ⨾ COp_1) \subseteq \mathrm{dom}\, COp_2$

These have obvious counterparts in Z which we give now (we have elided the
quantification over inputs and outputs). This is called action refinement without
IO transformations because we have not considered how to deal with input and
output as yet.

Definition 20 (Action refinement without IO transformations).

Let R be the retrieve relation between data types $(AS, AI, \{AOp\})$ and
$(CS, CI, \{COp_1, COp_2\})$. R is an action downward simulation if the following
hold

$\forall AS; CS; CS' \bullet pre\ AOp \land (COp_1 ⨾ COp_2) \land R \Rightarrow \exists AS' \bullet R' \land AOp$

$\forall AS; CS \bullet pre\ AOp \land R \Rightarrow pre\ COp_1$

$\forall AS; CS \bullet pre\ AOp \land R \land COp_1 \Rightarrow pre\ COp_2$

together with the initialisation condition. □

³ This Z expression (ab)uses the representation of sequences by functions in Z.
Functional programmers might be happier with $(concat\,(map\ \rho\ p))_C \subseteq p_A$.



162 Eerke Boiten and John Derrick

As an example, we could consider Example Ex6 as an action refinement, 
refining Move into two observable operations, SetNext followed by MoveDet. It 
is easy to check that this is a correct action refinement. 
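The three relational conditions that Definition 20 transcribes into Z (one correctness condition and two applicability conditions) can be checked directly on finite relations. The sketch below is illustrative only: the state spaces, the operations, and the identity retrieve relation are hypothetical, chosen so that an abstract step of +2 is split into two concrete steps of +1.

```python
def compose(r, s):
    """Relational composition r ; s on sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def dom(r):
    return {a for (a, _) in r}

def ran(r):
    return {b for (_, b) in r}

def dres(s, r):
    """Domain restriction: keep only pairs whose first component is in s."""
    return {(a, b) for (a, b) in r if a in s}

def check_action_conditions(R, AOp, COp1, COp2):
    """Return (correctness, applicability of COp1, applicability of COp2)."""
    guarded = dres(dom(AOp), R)
    correct = compose(compose(guarded, COp1), COp2) <= compose(AOp, R)
    app1 = ran(guarded) <= dom(COp1)
    app2 = ran(compose(guarded, COp1)) <= dom(COp2)
    return correct, app1, app2

# Hypothetical model: abstract states 0, 2, 4; concrete states 0..5.
AOp = {(0, 2), (2, 4)}                     # abstract step: add 2
COp1 = {(c, c + 1) for c in range(5)}      # concrete half step: add 1
COp2 = set(COp1)                           # second half step: add 1
R = {(a, a) for a in (0, 2, 4)}            # retrieve relation: identity

print(check_action_conditions(R, AOp, COp1, COp2))
```

Replacing `COp2` by the empty relation keeps the correctness condition vacuously true but breaks the second applicability condition, which shows why all three conditions are needed.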

What is the price of liberalisation though? Well, since SetNext and MoveDet 
are both observable operations, a valid program might only include one half of 
the concrete decomposition. The simulation rules provide no regulation to ensure 
that nothing bad happens. We can however introduce some extra policing to 
ensure the operations in the concrete decomposition are well behaved. 

For example, we know that if $COp_2$ happens straight after $COp_1$, then the
composition refines AOp. But what happens if we are in a concrete state that
was not a result of a $COp_1$ invocation? We could now require that $COp_2$ had
no effect, i.e. outside the range of $COp_1$, $COp_2$ should be skip. This can be
formalised as⁴

$R ⨾ (\mathrm{ran}\, COp_1 ⩤ COp_2) \subseteq R$

An alternative policy is that if $COp_1$ has occurred, then the only operation
to have an effect should be $COp_2$, i.e. other operations cannot interfere with a
concrete transaction yet to be completed. This is modelled by saying that $COp_1$
followed by any other operation (apart from $COp_2$) is the same as $COp_1$.

$COp_1 ⨾ Op \subseteq COp_1$ for all $Op \neq COp_2$

The opposite way to achieve a safety property on the first part of a
decomposition is to require the cancellation of a half completed sequence of
concrete operations, that is

$COp_1 ⨾ Op \subseteq Op$ for all $Op \neq COp_2$

We should note here in passing that these properties, although preserved if 
we take the weakest refinement, are not preserved by arbitrary refinements. 



7.1 Inputs and Outputs 

The action refinement of Move into SetNext followed by MoveDet is rather
trivial. The real challenge in action refinement is to deal with operations
containing input and output, and ask how we can decompose this IO across the
sequence of concrete operations. For example, we might wish to refine Translate
in Ex1 into two operations each of which translates a single coordinate.

⁴ ⩤ denotes domain anti-restriction, i.e. $S ⩤ R = \{(x, y) \in R \mid x \notin S\}$.



Liberating Data Refinement 163



This can be supported if we combine the above action refinement rules with
IO refinement discussed in Section 4. We will not go into all the details here, but
essentially we use input and output transformers whose structure is identical to
that of the mapping $\rho : I_A \to \mathrm{seq}\, I_C$.

Thus instead of input and output transformers IT and OT which change the
input/output of one operation into another, we use mappings $r_{in}$ and $r_{out}$ where

$r_{in} : Ainput \to \mathrm{seq}\, Cinput$

$r_{out} : Aoutput \to \mathrm{seq}\, Coutput$



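This allows input and output to be split across an action refinement (details
of the full definition are in [11]). For example, Translate can be action refined
into TranslateX ⨾ TranslateY where

┌─ TranslateX ─────────────
│ Δ2D
│ x? : ℤ
├──────────────────────────
│ x' = x + x?
│ y' = y
└──────────────────────────

┌─ TranslateY ─────────────
│ Δ2D
│ y? : ℤ
├──────────────────────────
│ x' = x
│ y' = y + y?
└──────────────────────────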
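The intended effect of this decomposition can be sketched executably. The definition of Translate is not shown in this section, so the version below (adding both inputs to the respective coordinates) is an assumption, as are the function names and the input-splitting helper `r_in`; the sketch only checks that running the two concrete operations in sequence on the split input agrees with the assumed abstract operation on a small grid.

```python
from itertools import product

def translate(state, x_in, y_in):
    """Assumed abstract operation: translate both coordinates at once."""
    x, y = state
    return (x + x_in, y + y_in)

def translate_x(state, x_in):
    """TranslateX: x' = x + x?, y' = y."""
    x, y = state
    return (x + x_in, y)

def translate_y(state, y_in):
    """TranslateY: x' = x, y' = y + y?."""
    x, y = state
    return (x, y + y_in)

def r_in(abstract_input):
    """Hypothetical input splitting: one abstract input pair becomes
    a sequence of two concrete inputs."""
    x_in, y_in = abstract_input
    return [x_in, y_in]

def run_sequence(state, ops, inputs):
    """Apply each concrete operation to its own input, in order."""
    for op, value in zip(ops, inputs):
        state = op(state, value)
    return state

VALUES = range(-2, 3)
agree = all(
    run_sequence((x, y), [translate_x, translate_y], r_in((dx, dy)))
    == translate((x, y), dx, dy)
    for x, y, dx, dy in product(VALUES, repeat=4)
)
print(agree)
```

Because both operations here are total and deterministic, the applicability conditions hold trivially and only the correctness condition is exercised.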



8 Conclusions and Related Work 

We have presented an overview of refinement relations that could be used for 
abstract data types in (a language like) Z, going from the traditional rules to a 
collection of more liberal rules. All of these have been motivated as consequences 
or extensions of the data refinement theory set forth by He, Hoare and Sanders. 

The generalisations proposed are compatible with each other. First, IO
refinement can be combined with the other generalisations, as it (like traditional
refinement) derives from Definition 4, which is generalised by the other
definitions. Weak refinement and alphabet translation are orthogonal, as in
Definition 18 it is possible to generalise by taking the closures of p and q under
composition with internal behaviour, with the appropriate extra quantification.
The same sort of construction could be applied to Definition 19 to combine action
refinement with weak refinement. The combination of alphabet translation and
action refinement is obtained by taking the decomposition mapping ρ to be an
arbitrary total relation between abstract indices and sequences of concrete
indices. Injectivity of alphabet translation may need to be weakened or
reinterpreted in that situation. The combination of all three generalisations
seems feasible theoretically, although mind-boggling in practice. As is already
the case with operation refinement and data refinement, it is probably advisable
to perform refinement in “one dimension at a time”.

Most of the refinement relations considered have been shown to be useful
in our work on viewpoint specification in Z [3,6]. We now have a refinement
relation which allows the omission of operations, inputs, and outputs from the
specification of viewpoints to which they are of no concern. Action refinement
will, in addition to that, allow viewpoints at different levels of granularity.

With the exception of the considerations on adding external operations, 
the refinement rules have been presented in our earlier papers [13,14,5,11], some 
also in work by Woodcock and others [7,25]. However, the relationship between 
these rules and (generalisations of) Definition 4 has not previously been made 
explicit, and we have not previously analysed the compatibility of these rules. 
Further details are expected to appear in [12]. 

A comprehensive and detailed overview of the semantic foundations of data 
refinement with an extensive bibliography can be found in [9]. 

The issue of providing extra methods in subtypes/subclasses while preserv- 
ing a semantical (“refinement”) interpretation of inheritance has been studied 
extensively, most notably in [21,20].



Abstract. Compositional designs require component specifications that
can be composed: designers have to be able to deduce system properties
from component specifications. On the other hand, component
specifications should be abstract enough to allow component reuse and to hide
substantial parts of correctness proofs in component verifications. Part
of the problem is that specifications that are too abstract do not contain
enough information to be composed. Therefore, the right balance between
abstraction and composability must be found. This paper explores the
systematic construction of abstract specifications that can be composed
through specific forms of composition called existential and universal.



1 Motivations 

1.1 Specifications and Proofs in Compositional Design 

This paper explores proofs of correctness of systems constructed by composing 
components. The promise of component technology is that the same component 
can be used in many systems, and thus the effort that goes into specifying, prov- 
ing and implementing components can be exploited many times. Compositional 
design is productive when the effort required to find and compose components is 
less than the effort required to design an entire system from scratch. Therefore, 
greater productivity is achieved by using components that embody substantial 
effort. Rapidly growing commercial efforts into developing libraries of software 
components (using, for instance, Java beans, Microsoft DNA or CORBA) are 
witnesses to the industrial interest in component-based designs. 

In this paper, we explore the appropriate level of detail for component spec- 
ifications. Specifications that are too abstract may be too weak to be useful in 
composition. Specifications that are too detailed are difficult to reuse and may 
require systems designers to derive more useful and more abstract specifications 
from the detailed ones. Component designers should identify those component 
properties that they expect systems designers to need, and design their compo- 
nents to have these properties. This may require putting a substantial amount 
of effort into proving that a component has properties that are useful in com- 
position. However, these proofs are achieved at the component level and will be 
reused each time the component is part of a new system.

Figure 1 depicts specifications and proofs in a compositional design. Proofs
labeled with ‘T’ are those component-correctness proofs that are left unchanged 
through composition and that can be reused in the design of several systems. 
Proofs labeled with ‘C’ are proofs of composition, i.e., proofs of system properties 
from component properties. The way components are specified, and especially 
the level of abstraction of their specifications, influences the amount of effort that 
has to be put in T-proofs and in C-proofs. A good framework for composition 
should allow us to put most of the effort in T-proofs and keep C-proofs as simple 
as possible. 




Fig. 1. A compositional design 



1.2 Abstract Specifications that Compose 

We introduce an informal concept of compositional properties to motivate our 
exploration, and define terms precisely later. Compositional properties are those 
classes of properties that allow us to deduce system properties from component 
properties using simple rules. For example, mass is a compositional property 
because the mass of a system can be deduced in a simple way from the masses of 
components: the system mass is the sum of component masses. By contrast, heat
emitted does not appear to be a compositional property because the heat emitted
by a system depends in very complex ways on the shapes, masses, insulation 
properties and locations of the components. However, engineers have to compute 
properties of composed systems given properties of components, whether the 
properties are compositional or not. 

In this paper, we restrict ourselves to properties that are predicates on sys- 
tems. A compositional property is a property whose truth can be established 
using simple rules from properties of components. The questions that we are 
exploring are the following: 

– What are interesting compositional properties and what are the
corresponding proof rules?

– How can we deduce any system property from conjunctions of these
compositional properties?

The simplest rules are those that establish that a property X holds for a 
system given that (i) property X holds for at least one component, or (ii) prop- 
erty X holds for all components. In this paper, we focus our attention on two 
kinds of compositional properties: existential properties and universal properties. 
A property is an existential property exactly when for all systems, a system has 
the property if there exists a component of the system that has the property. A 
property is a universal property exactly when for all systems, a system has the 
property if all components of the system have the property. 

Many interesting properties are neither universal nor existential. So, we de- 
duce such system properties in two steps: 

— First, we specify components as conjunctions of universal and existential 
properties so that we can readily derive universal and existential system 
properties from component properties, 

— then we derive the system properties we need from these universal and exis- 
tential system properties. 



1.3 An Illustrative Digression 

The next few paragraphs give the reader some intuition of the tradeoff between 
too much detail and too little detail in component specification by considering 
the difference between two kinds of properties of reactive systems: invariants and 
always. This issue has been discussed earlier, for instance in [15,14]. 

In the context of systems (and components) defined by their infinite compu- 
tations (reactive systems), a state predicate is an invariant property of a program 
exactly when 

– the predicate holds in the initial state of all computations of the program,
and

– all atomic transitions from states in which the predicate holds are to states
in which the predicate continues to hold.

A state predicate is an always property of a program exactly when it holds
in all states of all computations of the program. 

Invariants tell us about transitions from all states, whether reachable or un- 
reachable. By contrast, always properties tell us about reachable states only. 

All invariant properties are always properties. An always property need not 
be an invariant property because the system may have a transition from an 
unreachable state for which the property holds to a state in which the property 
does not hold. 

An invariant property is universal. If all components of a system enjoy an 
invariant property then the composed system also enjoys that invariant property. 
By contrast, an always property is not necessarily universal. 

Properties relevant to a system in isolation (i.e., a system that is not com- 
posed with other systems) are different from properties relevant to components 
that we expect to compose with other components. When we study the behavior 
of a system in isolation, its always properties are relevant, and it is not helpful to 
know whether an always property is also an invariant property. In this context, 
always properties offer the appropriate degree of abstraction. 

We cannot, however, deduce system properties from always properties of 
components. The concurrent composition of systems, all of which have the al- 
ways property that a bank account is nonnegative, may yield a system in which 
the bank account does indeed become negative. This is because a state that is 
unreachable when a system executes in isolation may be reachable when that 
system is composed with other systems. 
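
The bank-account scenario can be made concrete with a small sketch. It assumes, in the style of Unity-like models, that a component is a pair (initial states, set of atomic transitions) over a shared integer balance, and that concurrent composition intersects the initial states and takes the union of the transitions. These modelling choices and all names below are illustrative assumptions, not definitions taken from this paper.

```python
def reachable(component):
    """States reachable from the initial states via the component's transitions."""
    initial, transitions = component
    seen, frontier = set(initial), list(initial)
    while frontier:
        state = frontier.pop()
        for src, dst in transitions:
            if src == state and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def is_always(pred, component):
    """pred holds in every reachable state."""
    return all(pred(s) for s in reachable(component))


def is_invariant(pred, component):
    """pred holds initially and is preserved by every transition, reachable or not."""
    initial, transitions = component
    return (all(pred(s) for s in initial)
            and all(pred(dst) for src, dst in transitions if pred(src)))


def compose(c1, c2):
    """Assumed composition: common initial states, union of transitions."""
    return (c1[0] & c2[0], c1[1] | c2[1])


def nonneg(balance):
    return balance >= 0


# Withdraws 10 only from balance 5, a state it never reaches on its own.
withdrawer = (frozenset({0}), frozenset({(0, 0), (5, -5)}))
# Deposits 5 once.
depositor = (frozenset({0}), frozenset({(0, 5)}))
system = compose(withdrawer, depositor)
```

Under these assumptions, both components have the always property "balance is nonnegative", yet the composed system reaches a negative balance. The property is an invariant of the depositor but not of the withdrawer, which is where the composition argument breaks.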

Thus, always properties are too weak to be helpful in composition even 
though they have the right degree of abstraction for systems in isolation. How- 
ever, invariant properties are helpful in composition because they are universal 
properties. Then, we can weaken invariants to obtain desired always properties. 

In our research, we are looking for simple rules that allow us to prove system 
properties from component properties, for certain restricted kinds of proper- 
ties. We don’t expect to find simple rules for deducing arbitrary kinds of system 
properties from arbitrary kinds of component properties. The challenge is to find 
appropriate kinds of compositional properties, and then deduce desired proper- 
ties by weakening conjunctions of compositional properties. 



1.4 Predicate Transformers for Composition 

Consider the following problem. A designer of a component F would like to 
demonstrate that any system that has F as a component enjoys a property X. If 
property X is an existential property, and if F has property X, then any system 
that has F as a component will enjoy property X. What if X is not existential? 

If we could define a predicate transformer $\mathcal{E}$ where $\mathcal{E}.X$ is existential and at 
least as strong as X, and if we could demonstrate that component F has property 
$\mathcal{E}.X$, then any system that includes component F would also have the existential 
property $\mathcal{E}.X$, and therefore would also enjoy the weaker property X. Therefore, 
component designers can ensure that all systems that contain their components 
have a property X by proving that their components have a stronger existential 
property $\mathcal{E}.X$. 

What requirements should we place on predicate transformer $\mathcal{E}$ other than 
that $\mathcal{E}.X$ must be stronger than X? The obvious answer is that $\mathcal{E}.X$ should be as 
weak as possible. Ideally, it should be the weakest existential property stronger 
than X, provided that such a property exists. 

Now consider the analogous case for universal properties. We would like to 
prove that a system has a property X if all its components have property X even 
though X is not a universal property. So, we attempt to introduce a predicate 
transformer $\mathcal{U}$ with the requirement that $\mathcal{U}.X$ is universal and stronger than X. 
If we can prove that all components of a system have property $\mathcal{U}.X$ then we can 
conclude that the system enjoys this property and hence also enjoys the weaker 
property X. 

Can we require that $\mathcal{U}.X$ be the weakest universal property stronger than X? 
We can show that we cannot do so because there does not exist, in general, a 
weakest universal property stronger than X. The idea of a predicate transformer 
$\mathcal{U}$ where $\mathcal{U}.X$ is universal and is stronger than X will indeed help in engineering 
systems by composing components, but we must define $\mathcal{U}$ in some way other than 
being the weakest. We do not define a transformer $\mathcal{U}$ in this paper. We only show 
that $\mathcal{U}.X$ cannot be defined as the weakest universal property stronger than X. 

1.5 Overview 

Our components are abstract entities. They are not necessarily programs and 
they may not have “states” or “computations”. We consider composition op- 
erators that have certain algebraic properties such as associativity and explore 
theorems that are derived from these properties. 

In the next section we introduce components, their properties and the com- 
position law. We also introduce a simple model of components that is used as an 
example throughout the paper, as well as our notation and vocabulary. Section 3 
presents the definition of existential and universal properties. In this section we 
introduce a special form of existential properties called guarantees. The main 
results of the paper are presented in section 4 where a property transformer 
$\mathcal{E}$ for existential composition is defined. Theorems with regard to this property 
transformer are presented, and the reasons why a property transformer for uni- 
versal composition is not defined in the same way are explored. An example, that 
uses the simple model previously defined, concludes that section. In section 5, the 
work is compared with other proposed approaches and some remaining questions 
are formulated. Finally, conclusions are drawn in section 6. 

2 Terminology and Notations 

2.1 Predicates, Components and Properties 

Function application is denoted with a dot, which has higher precedence than 
boolean operators and which associates to the left (i.e., a.b.c = (a.b).c). The 
application of function A to parameter x is denoted by A.x. Predicates are 
boolean-valued functions. We denote predicates with the capital letters X, Y, 
Z, ... Components are denoted with the capital letters F, G, H, ... Properties 
are predicates on components: X.F is the boolean “property X holds in 
component F”. 

2.2 Everywhere Operator 

Following [11], we use square brackets to denote that a predicate is “everywhere 
true”. For any property X, [X] is the boolean “X holds for all systems”. 

2.3 Composition 

We restrict our attention to a single composition operator $\circ$. We postulate the 
existence of a binary relation between components, and we restrict our attention 
to composition of components that satisfy this relation. We denote by $F \surd G$ 
the fact that components F and G can be composed, and then their composition 
is denoted by $F \circ G$. 

We assume the existence of a UNIT component such that, for any F: 

\[ \mathrm{UNIT} \surd F \;\land\; F \surd \mathrm{UNIT} \;\land\; (\mathrm{UNIT} \circ F = F \circ \mathrm{UNIT} = F) . \tag{1} \]

Furthermore, we assume that $\circ$ is associative and that, for any F, G and H: 

\[ F \surd G \,\land\, (F \circ G) \surd H \;\equiv\; G \surd H \,\land\, F \surd (G \circ H) . \tag{2} \]

The left-hand side and the right-hand side denote (equivalently) that F, G and H 
can be composed (in that order). We introduce the shortcut $F \surd G \surd H$ to 
represent either side of the equivalence. 

Note that we do not assume here any other property of the operator o, such 
as symmetry or idempotency. 

2.4 Bags of Colored Balls 

To illustrate the results presented in this paper, we use a model for components 
defined as follows: 

– components are bags of colored balls; 

– bags can always be composed ($F \surd G$ for all F and G) and composition is 
the union of contents of the component bags in the composed bag; 

– the UNIT element is the empty bag. Note that properties of the form “all 
balls in the bag are red” hold in UNIT. 

Composition here is always defined and is symmetric (Abelian monoid). This 
need not be the case for more interesting models. For instance, sequential 
composition of programs is not symmetric and parallel composition of processes may 
not always be possible (for example, if a process references a local variable from 
another process). 
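
The bag model is small enough to write down directly. The sketch below assumes a bag is represented by Python's `collections.Counter` (a multiset keyed by color name); the names `compose`, `UNIT` and `all_red` are illustrative choices, not notation from the paper.

```python
from collections import Counter

# The UNIT component: the empty bag.
UNIT = Counter()


def compose(f, g):
    """Composition of bags: the multiset union of their contents."""
    return f + g


def all_red(bag):
    """Property 'all balls in the bag are red' (vacuously true for the empty bag)."""
    return all(color == "red" for color in bag.elements())


F = Counter(red=2, blue=1)
G = Counter(red=1)
H = Counter(black=3)
```

With this representation, the unit law (1), associativity and symmetry can be checked on sample bags, and `all_red(UNIT)` holds vacuously, as noted in the list above.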

3 Existential and Universal Composition 

We start our exploration of compositional properties by studying properties that 
obey certain rules of universal and existential quantification. 

3.1 Existential Properties 

A property X is existential (denoted by the boolean exist.X) exactly when 

\[ \mathit{exist}.X \;\equiv\; (\forall F, G : F \surd G : X.F \lor X.G \Rightarrow X . F \circ G) . \tag{3} \]

A system enjoys an existential property if it has a component that enjoys that 
property. 

3.2 Universal Properties 

A property X is universal (denoted by the boolean univ.X) exactly when: 

\[ \mathit{univ}.X \;\equiv\; (\forall F, G : F \surd G : X.F \land X.G \Rightarrow X . F \circ G) . \tag{4} \]

A system enjoys a universal property if all its components enjoy that property. 
Note that any existential property is also universal: 

\[ \mathit{exist}.X \Rightarrow \mathit{univ}.X . \]
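
Definitions (3) and (4) can be explored by brute force in the bag model. The sketch below is an assumption-laden illustration: it enumerates only small bags over three hypothetical colors, so a `True` result means "no counterexample among the bags tried", not a proof, while a `False` result is a genuine counterexample.

```python
from collections import Counter
from itertools import product

COLORS = ("red", "blue", "black")


def small_bags(max_per_color=2):
    """All bags with at most max_per_color balls of each color."""
    for counts in product(range(max_per_color + 1), repeat=len(COLORS)):
        yield Counter({c: n for c, n in zip(COLORS, counts) if n > 0})


def is_existential(prop, bags):
    """Bounded check of (3): X.F or X.G implies X.(F o G)."""
    return all(prop(f + g) for f in bags for g in bags if prop(f) or prop(g))


def is_universal(prop, bags):
    """Bounded check of (4): X.F and X.G implies X.(F o G)."""
    return all(prop(f + g) for f in bags for g in bags if prop(f) and prop(g))


def at_least_3_balls(bag):
    return sum(bag.values()) >= 3


def all_black(bag):
    return set(bag) <= {"black"}


def all_red_or_all_blue(bag):
    return set(bag) <= {"red"} or set(bag) <= {"blue"}


BAGS = list(small_bags())
```

On this bounded universe, `at_least_3_balls` passes both checks, `all_black` passes only the universal check, and `all_red_or_all_blue` fails both (a red bag composed with a blue bag is a counterexample).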



3.3 “guarantees” Properties 

In this section, we show that there is a systematic way to build existential prop- 
erties from any property. We introduce a function guarantees, from pairs of 
properties to properties: 



\[ X \;\mathbf{guarantees}\; Y \,.\, F \;\triangleq\; (\forall H, K : H \surd F \surd K : X . H \circ F \circ K \Rightarrow Y . H \circ F \circ K) . \tag{5} \]



Properties of the form X guarantees Y are called guarantees properties. 

An important result about guarantees properties is that, for any X and Y , 
X guarantees Y is existential: 



Proposition 1   exist.(X guarantees Y) 

Proof: See the corollary of proposition 12. □ 

3.4 Basic Rules 

In this section, we give basic rules relating existential properties, universal prop- 
erties and guarantees [4] . Let E and E' be existential properties and U and U' 
be universal properties. Then, it can easily be proved that: 

\[ \mathit{exist}.(E \land E'), \quad \mathit{exist}.(E \lor E'), \quad \mathit{univ}.(U \land U'), \quad \mathit{univ}.(U \lor E) . \]

Note that the disjunction of universal properties is not universal in general, 
and that strengthening or weakening existential or universal properties does not 
preserve their existential or universal characteristics. 

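Furthermore, for any properties X, Y, Z, X', Y' and any existential property E: 

\[
\begin{array}{l}
E.\mathrm{UNIT} \;\equiv\; [E \equiv \mathit{true}], \\
{[X \Rightarrow Y]} \;\equiv\; [X \;\mathbf{guarantees}\; Y], \\
{[(X \;\mathbf{guarantees}\; Y) \Rightarrow (X \Rightarrow Y)]}, \\
{[(X \;\mathbf{guarantees}\; Y) \land (Y \;\mathbf{guarantees}\; Z) \Rightarrow (X \;\mathbf{guarantees}\; Z)]}, \\
{[(X \;\mathbf{guarantees}\; Y) \land (X' \;\mathbf{guarantees}\; Y') \Rightarrow (X \land X' \;\mathbf{guarantees}\; Y \land Y')]}, \\
{[(X \;\mathbf{guarantees}\; Y) \land (X' \;\mathbf{guarantees}\; Y') \Rightarrow (X \lor X' \;\mathbf{guarantees}\; Y \lor Y')]}.
\end{array}
\]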

3.5 Example 

In the bag of colored balls model: 

exist.(bag contains at least 3 balls), 
exist.(bag contains at least 2 red balls and at least 1 blue ball), 
exist.((bag contains at least 1 red ball) guarantees (bag contains at least 2 colors)), 
univ.(all balls in bag are black), 
univ.(all balls in bag are red, or bag contains at least 1 blue ball). 

The following properties are neither existential nor universal: 

(all balls in the bag are red, or all balls in the bag are blue), 

(if the bag contains at least 1 red ball, then the bag contains at least 2 colors). 

4 The Property Transformer $\mathcal{E}$ 

4.1 $\mathcal{E}$ and $\mathcal{E}'$ 

In this section, we show that, for any property X, there exists a weakest 
existential property stronger than X, denoted by $\mathcal{E}.X$. Note that X holds in any 
system containing a component for which $\mathcal{E}.X$ holds. 

We provide two equivalent formulations for $\mathcal{E}$: 

\[ \mathcal{E}.X \;\equiv\; (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : Y), \tag{6} \]

\[ \mathcal{E}'.X.F \;\equiv\; (\forall H, K : H \surd F \surd K : X . H \circ F \circ K) . \tag{7} \]

Instead of proving directly that $[\mathcal{E} \equiv \mathcal{E}']$, we prove that $\mathcal{E}.X$ is the weakest 
existential property stronger than X, and that $\mathcal{E}'.X$ also is the weakest existential 
property stronger than X. From the uniqueness of such a weakest property, we 
conclude that $[\mathcal{E} \equiv \mathcal{E}']$. 

Note that, by construction, $\mathcal{E}.X$ is weaker than any existential property 
stronger than X, but we have to prove that $\mathcal{E}.X$ is existential. On the other 
hand, $\mathcal{E}'.X$ is defined to be existential, but we have to prove that it is the 
weakest existential property stronger than X. 

4.2 $\mathcal{E}.X$ is the Weakest Existential Property Stronger than X 

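We consider the equation (in predicates): 

\[ Y : [Y \Rightarrow X] \land \mathit{exist}.Y . \tag{8} \]

It is well known [11] that such an equation has a weakest solution exactly 
when the disjunction of all solutions is itself a solution, and then this disjunction 
is the weakest solution. 

From definition (6), $\mathcal{E}.X$ is the disjunction of all the solutions of equation 
(8). Therefore, $\mathcal{E}.X$ is the weakest existential property stronger than X if and 
only if $\mathcal{E}.X$ is a solution of equation (8). The proof obligation is: 

\[ [\mathcal{E}.X \Rightarrow X] \land \mathit{exist}.(\mathcal{E}.X) . \]

Proposition 2   $[\mathcal{E}.X \Rightarrow X]$ 

Proof: 

\[
\begin{array}{cl}
 & \mathcal{E}.X \\
= & \{\text{Definition of } \mathcal{E}\ (6)\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : Y) \\
\Rightarrow & \{[Y \Rightarrow X] \land \mathit{exist}.Y \Rightarrow [Y \Rightarrow X], \text{ and } \exists Y \text{ is monotonic}\} \\
 & (\exists Y : [Y \Rightarrow X] : Y) \\
\Rightarrow & \{[[Y \Rightarrow X] \land Y \Rightarrow X], \text{ and } \exists Y \text{ is monotonic}\} \\
 & (\exists Y :: X) \\
= & \{\text{No free } Y \text{ in } X\} \\
 & X
\end{array}
\]
□ 

Proposition 3   $\mathit{exist}.(\mathcal{E}.X)$ 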

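Proof: We consider two components F and G such that $F \surd G$ and we prove that 
$\mathcal{E}.X.F \Rightarrow \mathcal{E}.X . F \circ G$. By a similar argument, $\mathcal{E}.X.G \Rightarrow \mathcal{E}.X . F \circ G$, and 
therefore we deduce that $\mathcal{E}.X.F \lor \mathcal{E}.X.G \Rightarrow \mathcal{E}.X . F \circ G$, i.e., that $\mathcal{E}.X$ is existential. 

\[
\begin{array}{cl}
 & \mathcal{E}.X.F \land F \surd G \\
= & \{\text{Definition of } \mathcal{E}\ (6)\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : Y).F \land F \surd G \\
= & \{\text{Predicate calculus}\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : Y.F \land F \surd G) \\
= & \{\text{Duplicate and expand } \mathit{exist}.Y\ (3)\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : \\
 & \quad (\forall F', G' : F' \surd G' : Y.F' \lor Y.G' \Rightarrow Y . F' \circ G') \land Y.F \land F \surd G) \\
\Rightarrow & \{\text{Choose } F' := F \text{ and } G' := G\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : (F \surd G \Rightarrow (Y.F \lor Y.G \Rightarrow Y . F \circ G)) \land Y.F \land F \surd G) \\
\Rightarrow & \{\text{Modus ponens}\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : Y . F \circ G) \\
= & \{\text{Predicate calculus}\} \\
 & (\exists Y : [Y \Rightarrow X] \land \mathit{exist}.Y : Y) . F \circ G \\
= & \{\text{Definition of } \mathcal{E}\ (6)\} \\
 & \mathcal{E}.X . F \circ G
\end{array}
\]
□ 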

Proposition 4 For any property X, there exists a weakest existential property 
stronger than X and it is $\mathcal{E}.X$. 

Proof: From propositions 2 and 3 and the characterization of extreme solutions 
of equations in predicates. □ 

4.3 $\mathcal{E}'.X$ is the Weakest Existential Property Stronger than X 

In this section, we prove that $\mathcal{E}'.X$ also is the weakest existential property 
stronger than X. We prove first that $\mathcal{E}'.X$ is a solution of equation (8), and then 
that any other solution is stronger than $\mathcal{E}'.X$. 

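Proposition 5   $[\mathcal{E}'.X \Rightarrow X]$ 

Proof: 

\[
\begin{array}{cl}
 & \mathcal{E}'.X.F \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & (\forall H, K : H \surd F \surd K : X . H \circ F \circ K) \\
\Rightarrow & \{\text{Choose } H := \mathrm{UNIT} \text{ and } K := \mathrm{UNIT}\} \\
 & (\mathrm{UNIT} \surd F \surd \mathrm{UNIT} \Rightarrow X . \mathrm{UNIT} \circ F \circ \mathrm{UNIT}) \\
= & \{\text{Axiom about UNIT (1)}\} \\
 & X.F
\end{array}
\]
□ 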



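Proposition 6   $\mathit{exist}.(\mathcal{E}'.X)$ 

Proof: 

\[
\begin{array}{cl}
 & \mathit{exist}.(\mathcal{E}'.X) \\
= & \{\text{Definition of existential properties}\} \\
 & (\forall F, G : F \surd G : \mathcal{E}'.X.F \lor \mathcal{E}'.X.G \Rightarrow \mathcal{E}'.X . F \circ G) \\
= & \{\text{Predicate calculus}\} \\
 & (\forall F, G : F \surd G : \mathcal{E}'.X.F \Rightarrow \mathcal{E}'.X . F \circ G) \\
 & \land\; (\forall F, G : F \surd G : \mathcal{E}'.X.G \Rightarrow \mathcal{E}'.X . F \circ G)
\end{array}
\]

In order to prove these two proof obligations, we choose two components F 
and G such that $F \surd G$, and we prove first that 
$\mathcal{E}'.X.F \Rightarrow \mathcal{E}'.X . F \circ G$ and then that $\mathcal{E}'.X.G \Rightarrow \mathcal{E}'.X . F \circ G$. 
(The two proofs are different, because $\surd$ and $\circ$ may not be symmetric.) 

\[
\begin{array}{cl}
 & \mathcal{E}'.X.F \land F \surd G \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & (\forall H, K : H \surd F \surd K : X . H \circ F \circ K) \land F \surd G \\
\Rightarrow & \{\text{For } K' \text{ s.t. } G \surd K', \text{ replace } K \text{ with } G \circ K'\} \\
 & (\forall H, K' : G \surd K' \land H \surd F \surd G \circ K' : X . H \circ F \circ G \circ K') \land F \surd G \\
= & \{\text{Axiom about } \surd\ (2), \text{ using the shortcut}\} \\
 & (\forall H, K' : F \surd G \land H \surd F \circ G \surd K' : X . H \circ F \circ G \circ K') \land F \surd G \\
\Rightarrow & \{\text{Modus ponens}\} \\
 & (\forall H, K' : H \surd (F \circ G) \surd K' : X . H \circ (F \circ G) \circ K') \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & \mathcal{E}'.X . F \circ G
\end{array}
\]

\[
\begin{array}{cl}
 & \mathcal{E}'.X.G \land F \surd G \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & (\forall H, K : H \surd G \surd K : X . H \circ G \circ K) \land F \surd G \\
\Rightarrow & \{\text{For } H' \text{ s.t. } H' \surd F, \text{ replace } H \text{ with } H' \circ F\} \\
 & (\forall H', K : H' \surd F \land H' \circ F \surd G \surd K : X . H' \circ F \circ G \circ K) \land F \surd G \\
= & \{\text{Axiom about } \surd\ (2), \text{ using the shortcut}\} \\
 & (\forall H', K : F \surd G \land H' \surd F \circ G \surd K : X . H' \circ F \circ G \circ K) \land F \surd G \\
\Rightarrow & \{\text{Modus ponens}\} \\
 & (\forall H', K : H' \surd (F \circ G) \surd K : X . H' \circ (F \circ G) \circ K) \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & \mathcal{E}'.X . F \circ G
\end{array}
\]
□ 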

This completes the proof that $\mathcal{E}'.X$ is a solution of equation (8). We now prove 
that any other solution of (8) is stronger than $\mathcal{E}'.X$. 

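Proposition 7   $\mathit{exist}.X \;\equiv\; [\mathcal{E}'.X \equiv X]$ 

Proof: 

($\Leftarrow$) 

\[
\begin{array}{cl}
 & [\mathcal{E}'.X \equiv X] \\
= & \{\text{From proposition 6}\} \\
 & [\mathcal{E}'.X \equiv X] \land \mathit{exist}.(\mathcal{E}'.X) \\
\Rightarrow & \{\text{Leibniz}\} \\
 & \mathit{exist}.X
\end{array}
\]

($\Rightarrow$) Assume exist.X, prove that $X.F \equiv \mathcal{E}'.X.F$, for any F. 

\[
\begin{array}{cl}
 & X.F \\
= & \{\text{Introduce and expand } \mathit{exist}.X\} \\
 & X.F \land (\forall H, G : H \surd G : X.H \lor X.G \Rightarrow X . H \circ G) \\
\Rightarrow & \{\text{Choose } G := F\} \\
 & X.F \land (\forall H : H \surd F : X.H \lor X.F \Rightarrow X . H \circ F) \\
\Rightarrow & \{\text{Modus ponens}\} \\
 & (\forall H : H \surd F : X . H \circ F) \\
= & \{\text{Introduce and expand } \mathit{exist}.X\} \\
 & (\forall H : H \surd F : X . H \circ F \land (\forall G, K : G \surd K : X.G \lor X.K \Rightarrow X . G \circ K)) \\
\Rightarrow & \{\text{Choose } G := H \circ F\} \\
 & (\forall H : H \surd F : X . H \circ F \land (\forall K : H \circ F \surd K : X . H \circ F \lor X.K \Rightarrow X . H \circ F \circ K)) \\
= & \{\text{Move } \forall K \text{ outside}\} \\
 & (\forall H, K : H \surd F : X . H \circ F \land (H \circ F \surd K \Rightarrow (X . H \circ F \lor X.K \Rightarrow X . H \circ F \circ K))) \\
\Rightarrow & \{\text{Predicate calculus, } \forall H, K \text{ is monotonic}\} \\
 & (\forall H, K : H \surd F \land H \circ F \surd K : X . H \circ F \circ K) \\
= & \{\text{Axiom about } \surd\ (2), \text{ introducing the shortcut}\} \\
 & (\forall H, K : H \surd F \surd K : X . H \circ F \circ K) \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & \mathcal{E}'.X.F
\end{array}
\]

Since $[\mathcal{E}'.X \Rightarrow X]$ (prop. 5), this completes the proof of $X.F \equiv \mathcal{E}'.X.F$. □ 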

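Proposition 8 ($\mathcal{E}'$ is universally conjunctive) For any set S: 

\[ [\mathcal{E}'.(\forall X : X \in S : X) \;\equiv\; (\forall X : X \in S : \mathcal{E}'.X)] . \]

Proof: 

\[
\begin{array}{cl}
 & \mathcal{E}'.(\forall X : X \in S : X).F \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & (\forall H, K : H \surd F \surd K : (\forall X : X \in S : X) . H \circ F \circ K) \\
= & \{\text{Predicate calculus}\} \\
 & (\forall H, K : H \surd F \surd K : (\forall X : X \in S : X . H \circ F \circ K)) \\
= & \{\text{Interchange of universal quantifiers}\} \\
 & (\forall X : X \in S : (\forall H, K : H \surd F \surd K : X . H \circ F \circ K)) \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & (\forall X : X \in S : \mathcal{E}'.X.F) \\
= & \{\text{Predicate calculus}\} \\
 & (\forall X : X \in S : \mathcal{E}'.X).F
\end{array}
\]
□ 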

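Proposition 9 ($\mathcal{E}'$ is monotonic) 

\[ [X \Rightarrow Y] \Rightarrow [\mathcal{E}'.X \Rightarrow \mathcal{E}'.Y] \]

Proof: 

\[
\begin{array}{cl}
 & [X \Rightarrow Y] \\
= & \{\text{Predicate calculus}\} \\
 & [X \equiv X \land Y] \\
\Rightarrow & \{\text{Leibniz}\} \\
 & [\mathcal{E}'.X \equiv \mathcal{E}'.(X \land Y)] \\
= & \{\mathcal{E}' \text{ is conjunctive from prop. 8}\} \\
 & [\mathcal{E}'.X \equiv \mathcal{E}'.X \land \mathcal{E}'.Y] \\
= & \{\text{Predicate calculus}\} \\
 & [\mathcal{E}'.X \Rightarrow \mathcal{E}'.Y]
\end{array}
\]
□ 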

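Proposition 10 For any property X, there exists a weakest existential property 
stronger than X and it is $\mathcal{E}'.X$. 

Proof: From propositions 5 and 6, we know that $\mathcal{E}'.X$ is a solution of equation 
(8). It remains to show that any solution Z of (8) is stronger than $\mathcal{E}'.X$: 

\[
\begin{array}{cl}
 & [Z \Rightarrow X] \land \mathit{exist}.Z \\
= & \{[\mathcal{E}'.Z \equiv Z] \text{ from prop. 7}\} \\
 & [Z \Rightarrow X] \land [\mathcal{E}'.Z \equiv Z] \\
\Rightarrow & \{\mathcal{E}' \text{ is monotonic from prop. 9}\} \\
 & [\mathcal{E}'.Z \Rightarrow \mathcal{E}'.X] \land [\mathcal{E}'.Z \equiv Z] \\
\Rightarrow & \{\text{Leibniz}\} \\
 & [Z \Rightarrow \mathcal{E}'.X]
\end{array}
\]
□ 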

Proposition 11   $[\mathcal{E} \equiv \mathcal{E}']$ 

Proof: From the uniqueness of a weakest element, when it exists. □ 
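
4.4 Relationship with “guarantees” 

Proposition 12   $[X \;\mathbf{guarantees}\; Y \;\equiv\; \mathcal{E}'.(X \Rightarrow Y)]$ 

Proof: 

\[
\begin{array}{cl}
 & X \;\mathbf{guarantees}\; Y \,.\, F \\
= & \{\text{Definition of guarantees (5)}\} \\
 & (\forall H, K : H \surd F \surd K : X . H \circ F \circ K \Rightarrow Y . H \circ F \circ K) \\
= & \{\text{Predicate calculus}\} \\
 & (\forall H, K : H \surd F \surd K : (X \Rightarrow Y) . H \circ F \circ K) \\
= & \{\text{Definition of } \mathcal{E}'\ (7)\} \\
 & \mathcal{E}'.(X \Rightarrow Y).F
\end{array}
\]
□ 

Corollary: exist.(X guarantees Y) 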

4.5 A Property Transformer $\mathcal{U}$? 

The reason why, for any property X, there exists a weakest existential 
property stronger than X, which allowed us to define the property transformer $\mathcal{E}$, is 
that any disjunction of existential properties is existential. This is not true for 
universal properties. Indeed, we can exhibit a property X such that there 
is no unique weakest universal property stronger than X. This fact prevents us 
from defining a property transformer $\mathcal{U}$ in the same manner as we defined $\mathcal{E}$. 

Proposition 13 (No $\mathcal{U}$ transformer) There exist nontrivial models in which 
some properties do not have a unique weakest universal property stronger than 
them. 

Proof: To prove this claim, we use the model of bags of colored balls and we 
consider the property P defined by: 

\[ P_0 \equiv (\text{all balls are white}), \qquad P_1 \equiv (\text{all balls are black}), \qquad P \equiv P_0 \lor P_1 . \]

Clearly, $P_0$ and $P_1$ are universal and, if black and white balls exist at all, P 
is not. Let W, if it exists, be the weakest universal property stronger than P. 
Then: 

\[
\begin{array}{ll}
{[P_0 \Rightarrow W]} & \text{because } P_0 \text{ is universal and stronger than } P, \\
{[P_1 \Rightarrow W]} & \text{because } P_1 \text{ is universal and stronger than } P, \\
{[P \Rightarrow W]} & \text{from the two above}, \\
{[W \Rightarrow P]} & \text{by definition of } W, \\
{[W \equiv P]} & \text{from the two above}.
\end{array}
\]

But W is universal (by definition) and P is not, which leads us to a contradiction. 
Therefore, no such W exists. □ 
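
The key fact used in the proof, that P is not universal although both of its disjuncts are, has a two-ball witness. The sketch below assumes bags are represented as `collections.Counter` multisets; the function names are illustrative only.

```python
from collections import Counter


def all_white(bag):
    """P0: every ball in the bag is white (true for the empty bag)."""
    return set(bag) <= {"white"}


def all_black(bag):
    """P1: every ball in the bag is black (true for the empty bag)."""
    return set(bag) <= {"black"}


def p(bag):
    """P = P0 or P1."""
    return all_white(bag) or all_black(bag)


white = Counter(white=1)
black = Counter(black=1)
mixed = white + black  # composition of the two one-ball bags
```

Both `white` and `black` satisfy P, but `mixed` does not, so P violates definition (4). Composing two white bags (or two black bags) stays within $P_0$ (or $P_1$), which is consistent with each disjunct being universal.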

4.6 Example 

Let’s consider the following question: In the world of bags of colored balls, what is 
the property: (at least 1 red ball) guarantees (at least 2 colors)? Obviously, the 
answer is: $\mathcal{E}.((\text{at least 1 red ball}) \Rightarrow (\text{at least 2 colors}))$, but can we find a 
simpler, equivalent formulation? 

In this part, we prove the following equivalence:¹ 

\[ [\mathcal{E}.((\text{at least 1 red ball}) \Rightarrow (\text{at least 2 colors})) \;\equiv\; (\text{at least 1 non red ball})] . \tag{9} \]

We introduce two specific properties, $\mathrm{UNIT}_{=}$ (“being the unit”) and its 
negation $\mathrm{UNIT}_{\neq}$ (“not being the unit”): 

\[ \mathrm{UNIT}_{=}.F \equiv (F = \mathrm{UNIT}), \qquad \mathrm{UNIT}_{\neq}.F \equiv (F \neq \mathrm{UNIT}) . \]

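We will prove equivalence (9) by first proving the following proposition: 

Proposition 14   $\neg \mathit{exist}.\mathrm{UNIT}_{\neq} \;\lor\; [X \equiv \mathrm{UNIT}_{\neq}] \;\lor\; [\mathcal{E}.(\mathrm{UNIT}_{=} \lor X) \equiv \mathcal{E}.X]$ 

Proof: 

Assume $\mathit{exist}.\mathrm{UNIT}_{\neq}$ and $\neg[X \equiv \mathrm{UNIT}_{\neq}]$, and prove $[\mathcal{E}.(\mathrm{UNIT}_{=} \lor X) \equiv \mathcal{E}.X]$. 

First case: $F \neq \mathrm{UNIT}$ 

\[
\begin{array}{cl}
 & \mathcal{E}.(\mathrm{UNIT}_{=} \lor X).F \\
= & \{[\mathcal{E} \equiv \mathcal{E}'] \text{ from prop. 11, definition of } \mathcal{E}'\ (7)\} \\
 & (\forall H, K : H \surd F \surd K : (\mathrm{UNIT}_{=} \lor X) . H \circ F \circ K) \\
= & \{\text{From hypotheses } F \neq \mathrm{UNIT} \text{ and } \mathit{exist}.\mathrm{UNIT}_{\neq},\ \neg(\mathrm{UNIT}_{=} . H \circ F \circ K)\} \\
 & (\forall H, K : H \surd F \surd K : X . H \circ F \circ K) \\
= & \{\text{Definition of } \mathcal{E}'\ (7),\ [\mathcal{E} \equiv \mathcal{E}'] \text{ from prop. 11}\} \\
 & \mathcal{E}.X.F
\end{array}
\]

Second case: $F = \mathrm{UNIT}$ 

\[
\begin{array}{cl}
 & \mathcal{E}.(\mathrm{UNIT}_{=} \lor X).\mathrm{UNIT} \\
= & \{\mathit{exist}.(\mathcal{E}.Y) \text{ from prop. 3, basic rules of existential properties}\} \\
 & [\mathcal{E}.(\mathrm{UNIT}_{=} \lor X) \equiv \mathit{true}] \\
= & \{[\mathcal{E}.\mathit{true} \equiv \mathit{true}] \text{ and } [\mathcal{E}.Y \Rightarrow Y] \text{ from prop. 2}\} \\
 & [\mathrm{UNIT}_{=} \lor X \equiv \mathit{true}] \\
= & \{\text{Predicate calculus (UNIT is the only component where } X \text{ may not hold)}\} \\
 & [X \equiv \mathit{true}] \lor [X \equiv \mathrm{UNIT}_{\neq}] \\
= & \{\text{Hypothesis } \neg[X \equiv \mathrm{UNIT}_{\neq}]\} \\
 & [X \equiv \mathit{true}] \\
= & \{[\mathcal{E}.\mathit{true} \equiv \mathit{true}] \text{ and } [\mathcal{E}.Y \Rightarrow Y] \text{ from prop. 2}\} \\
 & [\mathcal{E}.X \equiv \mathit{true}] \\
= & \{\mathit{exist}.(\mathcal{E}.Y) \text{ from prop. 3, basic rules of existential properties}\} \\
 & \mathcal{E}.X.\mathrm{UNIT}
\end{array}
\]
□ 

¹ This example also shows that $\mathcal{E}$ is not disjunctive, i.e., $\mathcal{E}.X \lor \mathcal{E}.Y$ is, in general, 
strictly stronger than $\mathcal{E}.(X \lor Y)$. 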

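We can now apply proposition 14 to prove formula (9): 

\[
\begin{array}{cl}
 & \mathcal{E}.((\text{at least 1 red ball}) \Rightarrow (\text{at least 2 colors})) \\
= & \{\text{Definition of implication}\} \\
 & \mathcal{E}.((\text{no red balls}) \lor (\text{at least 2 colors})) \\
= & \{\text{Nonempty bags contain balls: } \mathrm{UNIT}_{=} \lor (\text{at least 1 ball}) \equiv \mathit{true}\} \\
 & \mathcal{E}.(((\mathrm{UNIT}_{=} \lor (\text{at least 1 ball})) \land (\text{no red balls})) \lor (\text{at least 2 colors})) \\
= & \{\text{Simplify using } (\text{at least 1 non red ball}) \equiv \\
 & \ \ ((\text{at least 1 non red ball}) \land (\text{no red balls})) \lor (\text{at least 2 colors})\} \\
 & \mathcal{E}.(\mathrm{UNIT}_{=} \lor (\text{at least 1 non red ball})) \\
= & \{\mathit{exist}.\mathrm{UNIT}_{\neq} \text{ in bags model; assume there exist red balls, hence} \\
 & \ \ \neg[(\text{at least 1 non red ball}) \equiv \mathrm{UNIT}_{\neq}]; \text{ apply prop. 14}\} \\
 & \mathcal{E}.(\text{at least 1 non red ball}) \\
= & \{\mathit{exist}.(\text{at least 1 non red ball})\} \\
 & (\text{at least 1 non red ball})
\end{array}
\]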

In other words, to ensure that any system that uses a certain component F 
will have at least two balls of different colors provided that it contains at least 
one red ball, it is sufficient and necessary that the component F contains at least 
one non red ball. This is consistent with our intuition. 
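
The calculated result can be cross-checked numerically in the bag model. The sketch below rests on two stated assumptions: bags are `collections.Counter` multisets over three hypothetical colors, and, because bag composition is symmetric, the two-sided environment quantified over in the explicit formulation of the transformer collapses to a single environment bag. Only small environments are enumerated, so in general this bounded check can only over-approximate the transformer; for this particular property a one-red-ball environment already suffices as a counterexample.

```python
from collections import Counter
from itertools import product

COLORS = ("red", "blue", "black")


def small_bags(max_per_color=2):
    """All bags with at most max_per_color balls of each color."""
    for counts in product(range(max_per_color + 1), repeat=len(COLORS)):
        yield Counter({c: n for c, n in zip(COLORS, counts) if n > 0})


def strongest_component_property(prop, environments):
    """Bounded sketch of the explicit formulation: prop must hold for
    the component composed with every enumerated environment."""
    return lambda f: all(prop(f + env) for env in environments)


def red_implies_two_colors(bag):
    """(at least 1 red ball) => (at least 2 colors)."""
    return bag["red"] == 0 or len(bag) >= 2


def has_non_red_ball(bag):
    """(at least 1 non red ball)."""
    return any(color != "red" for color in bag)


BAGS = list(small_bags())
component_property = strongest_component_property(red_implies_two_colors, BAGS)
```

On every enumerated bag F, `component_property(F)` agrees with `has_non_red_ball(F)`, matching equivalence (9).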

Note that instead of guessing the desired property of F and then proving that 
any system composed from F satisfies (at least 1 red ball) ⇒ (at least 2 colors), 
we calculate the desired property of F. This provides us with both the property 
and the proof at the same time, and avoids dealing explicitly with the universal 
quantification over components. 



5 Related Work 

5.1 Existential/Universal versus Assumption-Commitment 

The traditional “assumption-commitment” (or “rely-guarantee”) approach to 
composition of concurrent systems [1,2,13,17,10,8,9,12] relies on an explicit 
specification of a component’s possible environments. It is defined in terms of “open 
system computations” , in which some steps are labeled “environment steps” . In 
some sense, components are “prepared” to be composed, by leaving room for 
interaction with the outside world. Interaction with the environment is present 
from the start. If nothing is specified about the environment, few component 
properties can be proved. Properties like always are meaningless in this context. 

The approach presented here is the dual of assumption-commitment. In 
contrast to much of the work in assumption-commitment, we deal with properties 
of components, not components coupled to specific environments. There are no 
“environment steps” in our theory. Indeed, we want to deal with systems in 
which steps and computations may not exist. So, we do not use automata-based 
models or process models nor do we assume specific forms of computations such 
as open-system computations. 

In our approach, we do not study composition in terms of specific languages 
or logics such as TLA [1,2], linear temporal logic [13,12], Unity [17,10,8,9], CSP, 
or process algebras. Instead, we deal with component properties and composition 
in the abstract. We postulate simple rules for the composition operator (such 
as associativity and the existence of a unit element), and then base a theory on 
these rules. 

A potential weakness of our approach is that by exploiting only theorems that 
we can prove from a limited set of rules we obtain less useful results than by 
working with a specific programming language and its associated logic (say TLA 
or Unity). Despite this weakness, we believe that explorations such as these can 
help to identify the relationships between central results about compositional 
design and the postulates about composition and component properties. 



5.2 The Benefits of “guarantees” 

Existential and universal properties, as well as the guarantees operator were first 
introduced in [4] (with slightly different definitions). The guarantees operator 
has several advantages compared to corresponding operators in the assumption- 
commitment theory. 

Firstly, because guarantees properties do not reference an environment, but 
only deal with component and (global) system properties, they avoid a well- 
known circularity problem [1,3] (due to the fact that components are environ- 
ments of each other). The price we pay is that we cannot assume a property X 
of the environment to prove the same property X of the system, and such as- 
sumptions have proved to be useful in assumption-commitment specifications. 
To describe such behavior, we cannot use guarantees and instead we rely on 
universal properties (see, for instance, examples in [6]). 

Secondly, because X guarantees Y is existential, regardless of the 
properties X and Y, guarantees can be used with progress properties on its left-hand 
side. Indeed, system proofs can be simplified considerably by using guarantees 
properties with progress properties on the left-hand side [7] . For instance, a useful 
property of a distributed resource manager is: All clients eventually return the 
resources they are given guarantees the server eventually satisfies all requests 
from clients. By contrast, much of the literature on assumption-commitment 
specifications deals only with assumptions that are safety properties [1,2]. 

An advantage of having progress properties on the left-hand side of guaran- 
tees is that component designers can prove complex (progress) properties that 
can be used directly in proving composed systems. 

5.3 $\mathcal{E}$ and $\mathcal{E}'$ versus “guarantees” 

In [4], guarantees properties are the basic elements of the theory. In this paper, 
we rely on $\mathcal{E}$ instead. This use of predicate transformers leads to more concise 
definitions and simpler proofs. 

In particular, a simple but important theorem is that X guarantees Y is 
merely the application of predicate transformer $\mathcal{E}$ to the property $X \Rightarrow Y$. This 
theorem explains why guarantees is easy to reason with (it behaves like 
implication) and useful in specifying components (it does not strengthen implication 
too much). 

Two equivalent formulations of $\mathcal{E}$ are given in this report. The first is in 
terms of an extreme solution to an equation in predicates. This form is useful 
to deduce theorems about existential properties and guarantees. The second 
form uses an explicit quantification over components. This form has been useful 
in deducing proof rules for guarantees properties in the context of concurrent 
programs, namely Unity logic. (These rules are not given here.) 

5.4 Predicate Transformers and Universal Properties 

In section 4.5, we proved that, for some properties, there does not exist a 
weakest universal strengthening. Consequently, it is impossible to define a predicate 
transformer $\mathcal{U}$ that would be to universal composition what the transformer $\mathcal{E}$ is 
to existential composition. 

Universal properties are useful where guarantees properties cannot be used. 
However, elementary properties like always are not universal. Therefore, we are 
interested in studying universal properties that are stronger than specific prop- 
erties such as always. Invariant is one possible strengthening of always, but in- 
variant properties turn out to be too strong. Intermediate universal properties, 
between invariant and always, can be defined [9,17], but we need to experiment 
with using these properties on proving systems to determine if they have the 
appropriate level of abstraction. The existence of a weakest universal property 
stronger than always is still an open question. 

6 Conclusions 

Component technology allows designers to develop systems by composing sub- 
systems. Compositional design is simpler than designing systems from scratch 
if designers have access to large libraries of components, provided appropriate 
components in the library can be discovered easily, and there are simple rules for 
deducing system properties from component properties. This paper reports on 
an ongoing exploration of rules for deducing system properties from component 
properties. 

We have explored properties of composition that can be deduced from al- 
gebraic properties of the composition operator: associativity and the existence 
of a unit component. Though the exploration reported in this study is abstract 
and is independent of specific types of systems, we are particularly concerned 
with applying the results to parallel and sequential composition of programs. 
Since Hoare triples and weakest preconditions handle sequential composition 
very well, the practical results of our explorations are particularly important 
for concurrent composition. The concurrent composition operator, as defined in 
Unity, is symmetric (commutative) and idempotent in addition to being 
associative. This allows us to prove additional theorems about $\mathcal{E}$ for this special 
case and, for instance, to deduce proof rules for guarantees that are specific 
to the Unity model. We have applied these rules and the theorems discussed 
here to the compositional design of several message-passing programs [7] and 
shared-memory programs [6]. 

Requiring components to be specified in terms of conjunctions of existential 
and universal properties may seem too restrictive. We have found, however, that 
this requirement is at the appropriate level of specificity for compositional design. 
Existential properties are a surprisingly rich class of properties, especially since 
many properties can be expressed as guarantees properties. 

For example, we specify a first-in first-out single input, single output, mes- 
sage channel by the characteristic that the sequence of messages output “fol- 
lows” [5,16,7] the sequence of messages input. More precisely, such a specifi- 
cation is a guarantees property whose right-hand side is the conjunction of an 
always property (the output sequence is a prefix of the input sequence) and a 
progress property (any prefix of the input sequence is eventually a prefix of the 
output sequence). 

We have shown that there are some properties X for which there does not 
exist a weakest universal property stronger than X. For some properties of in- 
terest in specific domains, however, there may exist weakest universal properties 
stronger than them. For example, there may exist a weakest universal property 
stronger than any always property. We wish to explore such strengthening prop- 
erties because properties such as always are important for the specification of 
reactive systems. 



Abstract. We present an algorithm for inverse computation in a first- 
order functional language based on the notion of a perfect process tree. 
The Universal Resolving Algorithm (URA) introduced in this paper is 
sound and complete, and computes each solution, if it exists, in finite 
time. The algorithm has been implemented for TSG, a typed dialect of 
S-Graph, and shows some remarkable results for the inverse computa- 
tion of functional programs such as pattern matching and the inverse 
interpretation of While-programs. 



1 Introduction 

While standard computation is the calculation of the output of a program for a 
given input (‘forward execution’), inverse computation is the calculation of the 
possible input of a program for a given output (‘backward execution’). Inverse 
computation is an important and useful concept in many different areas. Ad- 
vances in this direction have been achieved in the area of logic programming, 
based on solutions emerging from logic and proof theory. 

But inversion is not restricted to the context of logic programming.
Reversibility is an important concept in any programming language, e.g., if one
direction of an algorithm is easier to define than the other, or if both
directions are needed (cf. encoding and decoding). Interestingly, inversion has
attracted relatively little interest in the area of functional programming
(exceptions are [5,9,18,20,21,25]), even though it is an essential concept in
mathematics.

We distinguish between two approaches for solving inversion problems: an 
inverse interpreter that performs inverse computation and an inverse compiler 
that performs program inversion. Determining, for a given program $p$ and output
$y$, an input $x$ of $p$ such that $[\![p]\!]\,x = y$ is inverse computation. A program that
produces $p^{-1}$ is an inverse compiler (also called a program inverter). Using $p^{-1}$
will then determine input $x$ of $p$.

* On leave from DIKU, Department of Computer Science, University of Copenhagen. 



R. Backhouse and J. N. Oliveira (Eds.): MPC 2000, LNCS 1837, pp. 187—212, 2000. 
© Springer- Verlag Berlin Heidelberg 2000 



Sergei Abramov and Robert Glück



[Diagram: (1) Inverse interpreter: inv-int takes program $p$ and output $y$ and produces input $x$. (2) Inverse compiler.]



As shown in [3,4], inverse computation and program inversion can be related
conveniently using the Futamura projections known from partial evaluation: a
program inverter is a generating extension of an inverse interpreter. In the
remainder of this paper we shall focus on inverse computation.

As an example of inverse computation, consider a pattern matcher which takes
two strings as input, pat and str, and returns SUCCESS if pat is a substring of 
str; FAILURE otherwise. For instance, computation with pattern "BC" and string 
"ABCD" returns SUCCESS, and the same string with pattern "CB" returns FAILURE. 



    $[\![\mathit{match}]\!]$ ["BC", "ABCD"] $\Rightarrow$ 'SUCCESS
    $[\![\mathit{match}]\!]$ ["CB", "ABCD"] $\Rightarrow$ 'FAILURE

(standard computation)



Given string str, we may want to ask inverse questions such as: Which patterns
are substrings of str, or which patterns are not substrings of str? To compute the
answer, we can either implement new programs, in general a time-consuming and
error-prone task, or we can use an inverse interpreter ura to extract the answer
from the original program. We do so by fixing the output to SUCCESS (or FAILURE)
and the string to str, while leaving the pattern unspecified (placeholders $X_1$, $X_2$).

    $[\![\mathit{ura}]\!]$ [match, [$X_1$, "ABCD"], 'SUCCESS] $\Rightarrow$ $ans_1$
    $[\![\mathit{ura}]\!]$ [match, [$X_2$, "ABCD"], 'FAILURE] $\Rightarrow$ $ans_2$

(inverse computation)

The answer tells us which values the placeholders may take. In general, com- 
putability of the answer is not guaranteed, even with sophisticated inversion 
strategies. Some inversions are too resource consuming, while others are unde- 
cidable. When a program is not injective in the missing input, the answer can 
either be universal (all possible inputs) or existential (one of the possible inputs). 
We will only consider universal solutions, hence the name for our algorithm. 
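
For contrast with the approach developed in this paper, the inverse questions above can be answered for a bounded search space by plain generate-and-test. The following Python sketch is not the URA: it is a naive baseline, and the function names `match` and `naive_inverse`, the finite alphabet, and the length bound `max_len` are assumptions made for illustration only.

```python
from itertools import product

def match(pat: str, s: str) -> str:
    """Forward computation: report whether pat is a substring of s."""
    return "SUCCESS" if pat in s else "FAILURE"

def naive_inverse(s: str, output: str, alphabet: str, max_len: int) -> list:
    """Generate-and-test: try every pattern up to max_len (an assumed bound)
    and keep those whose forward computation yields the requested output."""
    found = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            pat = "".join(letters)
            if match(pat, s) == output:
                found.append(pat)
    return found

# All patterns over {A,B,C,D} of length <= 4 that are substrings of "ABCD".
successes = naive_inverse("ABCD", "SUCCESS", "ABCD", 4)
# All patterns of length <= 2 that are not substrings of "ABCD".
failures = naive_inverse("ABCD", "FAILURE", "ABCD", 2)
```

Such a search only works because the candidate space was artificially bounded; it says nothing about patterns outside the chosen alphabet or length, which is the gap a universal solution is meant to close.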

Most of the earlier work on this topic (e.g., [5,6,7,16,17]) has been program
transformation by hand: specify a problem as the inverse of an easy computation, 
and then derive an efficient algorithm by manual application of transformation 
rules. By contrast, our approach aims for mechanical inversion. The first observa- 
tion [4] is that to do this, it suffices, in principle, to stage an inverse interpreter: 
via the Futamura projections this will give an inverse compiler. This is conve- 
nient because inverse computation is simpler than program inversion. The second 
key idea is to use the notion of a perfect process tree [12] to systematically trace 
the space of possible execution paths by standard computation, in order to find
the inverse computation. 

The Universal Resolving Algorithm (URA) introduced in this paper is sound 
and complete, and computes each solution, if it exists, in finite time. The algo- 
rithm has been designed for a first-order functional language with S-expressions 
as data structures. However, the principles and methods developed here are not 
limited to this language, but can be extended to other programming languages. 
The main contributions in this paper are: 

— an approach to inverse computation, its organization and structure, 

— a formal specification of a Universal Resolving Algorithm for a first-order 
functional language based on the notion of a perfect process tree, 

— an implementation of the algorithm and experiments with inverse computa- 
tion of programs such as pattern matchers and interpreters, 

— a constructive representation of sets of S-expressions allowing operations 
such as contractions and perfect splits. 

The paper is organized as follows. In Section 2 we formalize a set repre- 
sentation of S-expressions and in Section 3 we define our source language. A 
program-related extension of the set representation is introduced in Section 4. 
Sections 5-7 present the three steps to inverse computation. Implementation and 
experiments are discussed in Sections 8 and 9. We conclude with a discussion of
related work in Section 10 and future work in Section 11. 

2 A Set Representation of S-Expressions 

This section introduces the basic notions needed for inverse computation using a 
source language with S-expressions. In particular, we define a set representation 
of S-expressions and related operations such as substitution and concretization, 
contraction and splitting. 

A simple and elegant way to represent subsets of a value domain is to use 
variables, expressions with variables and restrictions on variables. Let us consider 
an example from mathematics. The definition of a set of 3D-points 

    $P = \{\, (x, y, x + y) \mid x > 0,\ y > x \,\}$

is expressed by means of (i) variables $x$ and $y$ (typed variables, in fact: it is
assumed that $x$ and $y$ range over the set of reals), (ii) expression $(x, y, x + y)$
with variables, and (iii) restrictions $x > 0$ and $y > x$ on variables. We will use
the same approach for representing sets of S-expressions and introduce similar 
notions: c-variables, c-expressions and restrictions. 

2.1 S-Expressions 

We use S-expressions known from Lisp as value domain for our programs. The 
syntax of S-expressions is given by the grammar in Fig. 1. Values are built
recursively from an infinite set of symbols using atom and cons as constructors.
A value $d \in \mathit{Dval}$ is ground. We will use 'z as shorthand for (atom z).
Fig. 1. S-expressions and c-expressions
(value domains named in the figure: CAexp; $X_e \in \mathit{CEvar}$, $X \in \mathit{Cvar}$, $X_a \in \mathit{CAvar}$, $z \in \mathit{Symb}$)





2.2 Representing Sets of S-Expressions 

Expressions with variables, called c-expressions (Fig. 1), represent sets of
S-expressions by means of two types of variables: ca-variables $X_a$ and ce-variables $X_e$,
where variables $X_a$ range over DAval, and variables $X_e$ range over Dval. To
further refine our set representation we introduce restrictions on variables (Fig. 2).
A restriction is a set of inequalities defining a set of values a ca-variable $X_a$ must
not be equal to. An inequality can be expressed between ca-variables and atoms.
Finally, we form pairs of c-expressions and restrictions, called cr-pairs for short (Fig. 2).
This will be our main method for representing and manipulating sets of S- 
expressions in a constructive way. These structures may contain c-variables and 
for notational convenience we indicate this by notation . 

Definition 1 (c-expression). A c-expression is an expression $d \in \mathit{Cexp}$ as
defined in Fig. 1. By var(d) we denote the set of all c-variables occurring in d.

Definition 2 (c-construction). A c-expression is a c-construction $cc \in \mathit{Ccon}$.
We define Ccon = Cexp.¹

Definition 3 (inequality, restriction). An inequality $\mathit{ineq} \in \mathit{Ineq}$ is an
unordered pair $(da_1 \mathbin{\#} da_2)$ with $da_1, da_2 \in \mathit{CAexp}$, or the symbol contra. A
restriction $r \in \mathit{Restr}$ is a finite set of inequalities. By var(r) we denote the set of all
ca-variables occurring in r.

Definition 4 (tautology, contradiction). A tautology is an inequality of the
form $(da_1 \mathbin{\#} da_2) \in \mathit{Ineq}$ where $da_1$, $da_2$ are ground and $da_1 \neq da_2$. A
contradiction is either an inequality of the form $(da \mathbin{\#} da) \in \mathit{Ineq}$ or the symbol contra. By
Tauto and Contra we denote the set of tautologies and the set of contradictions,
respectively.
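
Definitions 1 to 4 can be made concrete with a small data encoding. The Python sketch below is illustrative only: the tuple encoding, the names `variables`, `is_ground` and `classify`, and the use of ordered tuples for the (unordered) inequality pairs are assumptions of this sketch, not part of the paper.

```python
# Hypothetical tuple encoding (an assumption of this sketch):
#   ("atom", z)        atom with symbol z
#   ("cons", d1, d2)   pair of two c-expressions
#   ("xa", name)       ca-variable (ranges over atoms)
#   ("xe", name)       ce-variable (ranges over all values)

def variables(d):
    """var(d): the set of c-variables occurring in c-expression d."""
    if d[0] in ("xa", "xe"):
        return {d}
    if d[0] == "cons":
        return variables(d[1]) | variables(d[2])
    return set()  # atoms contain no variables

def is_ground(d):
    """A c-expression is ground if it contains no c-variables."""
    return not variables(d)

def classify(ineq):
    """Classify an inequality following Def. 4.
    ineq is either the string "contra" or a pair (da1, da2)."""
    if ineq == "contra":
        return "contradiction"
    da1, da2 = ineq
    if da1 == da2:
        return "contradiction"      # of the form (da # da)
    if is_ground(da1) and is_ground(da2):
        return "tautology"          # two distinct ground atoms
    return "undecided"              # depends on the values of the variables
```

The third outcome, "undecided", is not a notion from Def. 4; it simply names every inequality that is neither a tautology nor a contradiction.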

¹ In Sect. 4 we will extend the definition of domain Ccon with program-related
constructions: c-state s, c-environment $\sigma$, etc.
    $cr ::= (cc, r)$
    $r ::= \mathit{ineq}^{*}$
    $cc ::= d$   (see Fig. 1)
    $\mathit{ineq} ::= (da \mathbin{\#} da) \mid \mathit{contra}$

Value Domains

    $cr \in \mathit{CRpair}$    $r \in \mathit{Restr}$    $cc \in \mathit{Ccon}$    $\mathit{ineq} \in \mathit{Ineq}$

Fig. 2. CR-pairs and restrictions



Definition 5 (cr-pair). A cr-pair $cr \in \mathit{CRpair}$ is a pair $(cc, r)$ where $cc \in \mathit{Ccon}$
is a c-construction and $r \in \mathit{Restr}$ is a restriction. By var(cr) we denote
the set of c-variables occurring in cr: $\mathit{var}(cr) = \mathit{var}(cc) \cup \mathit{var}(r)$.

Example 1. The following expressions are cr-pairs:

    $cr_1$ = ((cons $X_a$ (cons $X_e$ 'Z)), $\emptyset$)

    $cr_2$ = ((cons $X_a$ (cons $X_a$ 'Z)), $\emptyset$)

    $cr_3$ = ((cons $X_a$ (cons $X_a$ 'Z)), { ($X_a \mathbin{\#}$ 'A) })

    $cr_4$ = ((cons $X_{a1}$ (cons $X_{a2}$ 'Z)), { ($X_{a1} \mathbin{\#} X_{a2}$) }) ∎



2.3 Substitution and Concretization 

We now define substitution and concretization. The semantics of applying a
substitution $\theta$ to a cr-pair cr is defined in Fig. 3. Substitution will be used to
define concretization $\lceil cr \rceil$, namely the set of S-expressions represented by cr.

Definition 6 (substitution). A substitution $\theta = [X_1 \mapsto d_1, \ldots, X_n \mapsto d_n]$ is
a sequence of typed bindings such that c-variables $X_i$ are pairwise distinct, $d_i$ are
c-expressions, and $X_i \in \mathit{CAvar}$ implies $d_i \in \mathit{CAexp}$, $i = 1 \ldots n$. Substitution $\theta$
is ground if all $d_i$ are ground. By $\mathit{dom}(\theta)$ we denote the set $\{X_1, \ldots, X_n\}$.

Definition 7 (substitution on c-construction). Let $cc \in \mathit{Ccon}$ be a
c-construction, and let $\theta = [X_1 \mapsto d_1, \ldots, X_n \mapsto d_n] \in \mathit{CCsub}$ be a substitution, then
the result of applying $\theta$ to cc, denoted $cc/\theta$, is the c-construction obtained by
replacing every occurrence of $X_i$ in cc by $d_i$ for every $X_i \mapsto d_i$ in $\theta$.

Definition 8 (full substitution). Let cc be a c-construction (or restriction, or
cr-pair), let $\theta$ be a substitution. Then $\theta$ is a full substitution for cc iff $\theta$ is ground
and $\mathit{var}(cc) \subseteq \mathit{dom}(\theta)$. By $FS(cc)$ we denote the set of all full substitutions for cc.

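Definition 9 (substitution on restriction). Let $\theta \in \mathit{CCsub}$, let $r \in \mathit{Restr}$,
then the result of applying $\theta$ to r, denoted $r/\theta$, is defined by

    $r/\theta \stackrel{\mathrm{def}}{=} \begin{cases} \{\mathit{contra}\} & \text{if } r' \cap \mathit{Contra} \neq \emptyset \\ r' \setminus \mathit{Tauto} & \text{otherwise,} \end{cases}$

where $r' = \{\, \mathit{ineq}/\theta \mid \mathit{ineq} \in r \,\}$.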
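CR-Pair:

    $(cc, r)/\theta = (cc/\theta,\ r/\theta)$

C-Expression:

    $X/\theta = \theta(X)$ if $X \in \mathit{dom}(\theta)$, and $X/\theta = X$ otherwise
    (atom z)$/\theta$ = (atom z)
    (cons $d_1$ $d_2$)$/\theta$ = (cons $d_1/\theta$ $d_2/\theta$)

Inequality:

    contra$/\theta$ = contra
    $(da_1 \mathbin{\#} da_2)/\theta = (da_1/\theta \mathbin{\#} da_2/\theta)$

Restriction:

    $r/\theta = \{\mathit{contra}\}$ if $r' \cap \mathit{Contra} \neq \emptyset$, and $r/\theta = r' \setminus \mathit{Tauto}$ otherwise,
    where $r' = \{\, \mathit{ineq}/\theta \mid \mathit{ineq} \in r \,\}$

Fig. 3. Definition of substitutions $cr/\theta$, $d/\theta$, $\mathit{ineq}/\theta$ and $r/\theta$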



The definition says that the result of applying a substitution $\theta$ to a restriction r is
either a contradiction, which means it is impossible to satisfy the new restriction,
or a new set of inequalities from which all tautologies have been removed.²

Let ineq be an inequality such that $\mathit{var}(\mathit{ineq}) = \emptyset$. According to Def. 4 we
have: ineq is either a tautology or a contradiction. This fact allows us to prove
the following proposition.

Proposition 1. Let $r \in \mathit{Restr}$ be a restriction and let $\theta \in FS(r)$ be a full
substitution for r, then either $r/\theta = \emptyset$ or $r/\theta = \{\mathit{contra}\}$.
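
Substitution on c-expressions and restrictions (Defs. 7 and 9) and the dichotomy of Proposition 1 can be sketched in a few lines of Python. The tuple encoding, the dictionary representation of substitutions and all function names are assumptions of this sketch; inequalities are stored as ordered pairs for simplicity.

```python
# Hypothetical tuple encoding (an assumption of this sketch):
#   ("atom", z), ("cons", d1, d2), ("xa", name), ("xe", name)
# A restriction is a set whose elements are "contra" or pairs (da1, da2).

def subst(d, theta):
    """d/theta: replace every variable of d that is bound in theta (a dict)."""
    if d[0] in ("xa", "xe"):
        return theta.get(d, d)
    if d[0] == "cons":
        return ("cons", subst(d[1], theta), subst(d[2], theta))
    return d  # atoms are unchanged

def is_ground(d):
    """True if d contains no c-variables."""
    if d[0] == "cons":
        return is_ground(d[1]) and is_ground(d[2])
    return d[0] == "atom"

def subst_restr(r, theta):
    """r/theta following Def. 9: {contra} if any resulting inequality is a
    contradiction, otherwise the resulting inequalities minus tautologies."""
    kept = set()
    for ineq in r:
        if ineq == "contra":
            return {"contra"}
        a, b = subst(ineq[0], theta), subst(ineq[1], theta)
        if a == b:                                # contradiction (da # da)
            return {"contra"}
        if not (is_ground(a) and is_ground(b)):   # tautologies are dropped
            kept.add((a, b))
    return kept

XA, A, B = ("xa", "1"), ("atom", "A"), ("atom", "B")
r = {(XA, A)}
satisfied = subst_restr(r, {XA: B})   # full substitution that satisfies r
violated = subst_restr(r, {XA: A})    # full substitution that violates r
```

For a full substitution the result is either the empty set or the singleton containing contra, which is exactly the check that concretization relies on.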

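Definition 10 (substitution on cr-pair). Let $cr = (cc, r) \in \mathit{CRpair}$ be a
cr-pair and $\theta \in \mathit{CCsub}$ be a substitution, then the result of applying $\theta$ to cr,
denoted $cr/\theta$, is defined by

    $cr/\theta \stackrel{\mathrm{def}}{=} (cc/\theta,\ r/\theta)$.

Definition 11 (cr-concretization). The set of data represented by cr-pair
$(cc, r) \in \mathit{CRpair}$, denoted $\lceil (cc, r) \rceil$, is defined by

    $\lceil (cc, r) \rceil \stackrel{\mathrm{def}}{=} \{\, cc/\theta \mid \theta \in FS((cc, r)),\ r/\theta = \emptyset \,\}$.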

Example 2. The cr-pairs from Example 1 represent the following sets of values:

    $\lceil cr_1 \rceil$ = { (cons da (cons d 'Z)) | $da \in \mathit{DAval}$, $d \in \mathit{Dval}$ }

    $\lceil cr_2 \rceil$ = { (cons da (cons da 'Z)) | $da \in \mathit{DAval}$ }

    $\lceil cr_3 \rceil$ = { (cons da (cons da 'Z)) | $da \in \mathit{DAval}$, $da \neq$ 'A }

    $\lceil cr_4 \rceil$ = { (cons $da_1$ (cons $da_2$ 'Z)) | $da_1, da_2 \in \mathit{DAval}$, $da_1 \neq da_2$ } ∎

² Even though from a formal point of view it is not necessary to remove all tautologies,
it is convenient to check for empty set after applying a full substitution (cf. Prop. 1).
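
The concretizations in Example 2 are infinite sets, but a finite slice can be enumerated to see Def. 11 at work. The Python sketch below restricts ca-variables to a small, assumed set of symbols and handles only cr-pairs without ce-variables and without the symbol contra; the encoding and all names are assumptions of this sketch.

```python
from itertools import product

# Hypothetical tuple encoding: ("atom", z), ("cons", d1, d2), ("xa", name).

def variables(d):
    """ca-variables occurring in a c-expression."""
    if d[0] == "xa":
        return {d}
    if d[0] == "cons":
        return variables(d[1]) | variables(d[2])
    return set()

def subst(d, theta):
    """d/theta for a substitution theta given as a dict."""
    if d[0] == "xa":
        return theta.get(d, d)
    if d[0] == "cons":
        return ("cons", subst(d[1], theta), subst(d[2], theta))
    return d

def concretize(cc, r, symbols):
    """Finite slice of the concretization of (cc, r): every full substitution
    that maps the ca-variables to atoms built from `symbols` is tried."""
    xs = sorted(variables(cc).union(*(variables(a) | variables(b) for a, b in r)))
    atoms = [("atom", z) for z in symbols]
    values = set()
    for choice in product(atoms, repeat=len(xs)):
        theta = dict(zip(xs, choice))
        # r/theta is empty exactly when every inequality becomes a tautology
        if all(subst(a, theta) != subst(b, theta) for a, b in r):
            values.add(subst(cc, theta))
    return values

XA1, XA2, Z = ("xa", "1"), ("xa", "2"), ("atom", "Z")
cr3 = (("cons", XA1, ("cons", XA1, Z)), {(XA1, ("atom", "A"))})
cr4 = (("cons", XA1, ("cons", XA2, Z)), {(XA1, XA2)})
slice3 = concretize(*cr3, "AB")   # only the value built from 'B remains
slice4 = concretize(*cr4, "AB")   # the two values with distinct atoms
```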
2.4 Contraction and Splitting

To narrow the set of values represented by a cr-pair, we introduce contractions.
A contraction $\kappa$ is either a substitution $\theta$ or a restriction r. A split is a pair
of contractions $(\kappa_1, \kappa_2)$ that partitions a set of values into two disjoint sets. A
perfect split guarantees that no elements will be lost, and no elements will be
added when partitioning a set.

Definition 12 (contraction). A contraction $\kappa \in \mathit{Contr}$ is either a substitution
$\theta \in \mathit{CCsub}$ or a restriction $r \in \mathit{Restr}$.



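Definition 13 (contracting). The result of contracting cr-pair $(cc, r) \in \mathit{CRpair}$
by contraction $\kappa \in \mathit{Contr}$, denoted $(cc, r)/\kappa$, is a cr-pair defined by

    $(cc, r)/\kappa \stackrel{\mathrm{def}}{=} \begin{cases} (cc/\kappa,\ r/\kappa) & \text{if } \kappa \in \mathit{CCsub} \\ (cc,\ r \cup \kappa) & \text{if } \kappa \in \mathit{Restr}. \end{cases}$

For notational convenience we also define

    $r/\kappa \stackrel{\mathrm{def}}{=} \begin{cases} r/\kappa \text{ (Def. 9)} & \text{if } \kappa \in \mathit{CCsub} \\ r \cup \kappa & \text{if } \kappa \in \mathit{Restr}. \end{cases}$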



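It is easy to show that $\lceil cr/\kappa \rceil \subseteq \lceil cr \rceil$ for all $cr \in \mathit{CRpair}$ and for all $\kappa \in \mathit{Contr}$.
That is, a contraction $\kappa$ never enlarges the set represented by a cr-pair.

Definition 14. Define two special contractions: identity $\kappa_{id} \stackrel{\mathrm{def}}{=} [\,] \in \mathit{CCsub}$ and
contradiction $\kappa_{contra} \stackrel{\mathrm{def}}{=} \{\mathit{contra}\} \in \mathit{Restr}$.

It is easy to show that for all $cr \in \mathit{CRpair}$:

    $\lceil cr/\kappa_{id} \rceil = \lceil cr \rceil$ and $\lceil cr/\kappa_{contra} \rceil = \emptyset$.

Definition 15 (split). A split $sp \in \mathit{Split}$ is a pair $(\kappa_1, \kappa_2)$ where $\kappa_1, \kappa_2 \in \mathit{Contr}$.

Definition 16 (perfect splitting). A split $(\kappa_1, \kappa_2) \in \mathit{Split}$ is perfect for
$cr \in \mathit{CRpair}$ if $(\kappa_1, \kappa_2)$ divides $\lceil cr \rceil$ into two disjoint sets $\lceil cr/\kappa_1 \rceil$ and $\lceil cr/\kappa_2 \rceil$ such
that

    $\lceil cr/\kappa_1 \rceil \cup \lceil cr/\kappa_2 \rceil = \lceil cr \rceil$ and $\lceil cr/\kappa_1 \rceil \cap \lceil cr/\kappa_2 \rceil = \emptyset$.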

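Theorem 1 (perfect splits). For all cr-pairs $(cc, r) \in \mathit{CRpair}$ the following
four splits are perfect:

1. $(\kappa_{id},\ \kappa_{contra})$

2. $([X_{a1} \mapsto da],\ \{(X_{a1} \mathbin{\#} da)\})$

3. $([X_{a1} \mapsto X_{a2}],\ \{(X_{a1} \mathbin{\#} X_{a2})\})$

4. $([X_{e3} \mapsto X_a^{*}],\ [X_{e3} \mapsto (\text{cons}\ X_{e1}^{*}\ X_{e2}^{*})])$

where $X_{a1}, X_{a2}, X_{e3} \in \mathit{var}(cc)$; $X_a^{*}, X_{e1}^{*}, X_{e2}^{*} \notin \mathit{var}(cc) \cup \mathit{var}(r)$; $da \in \mathit{DAval}$.
Remark: we use notation $*$ to denote fresh c-variables for $(cc, r)$.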



Proof: Omitted. 



□ 
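
Although the proof is omitted, the partition property of Def. 16 can be checked experimentally on a finite slice of the value domain. The Python sketch below does this for splits 2 and 3; it is evidence on a bounded universe, not a proof, and the encoding, the finite symbol set and all function names are assumptions of this sketch. Restrictions are kept unnormalized (tautologies are not removed), which does not change the represented set.

```python
from itertools import product

# Hypothetical tuple encoding: ("atom", z), ("cons", d1, d2), ("xa", name).

def variables(d):
    if d[0] == "xa":
        return {d}
    if d[0] == "cons":
        return variables(d[1]) | variables(d[2])
    return set()

def subst(d, theta):
    if d[0] == "xa":
        return theta.get(d, d)
    if d[0] == "cons":
        return ("cons", subst(d[1], theta), subst(d[2], theta))
    return d

def concretize(cc, r, symbols):
    """Finite slice of the concretization: ca-variables range over `symbols`."""
    xs = sorted(variables(cc).union(*(variables(a) | variables(b) for a, b in r)))
    atoms = [("atom", z) for z in symbols]
    values = set()
    for choice in product(atoms, repeat=len(xs)):
        theta = dict(zip(xs, choice))
        if all(subst(a, theta) != subst(b, theta) for a, b in r):
            values.add(subst(cc, theta))
    return values

def contract(cc, r, kappa):
    """(cc, r)/kappa: a dict is a substitution, a set is a restriction."""
    if isinstance(kappa, dict):
        return subst(cc, kappa), {(subst(a, kappa), subst(b, kappa)) for a, b in r}
    return cc, r | kappa

def is_perfect(cc, r, split, symbols):
    """Check Def. 16 on the finite slice: disjoint parts whose union is the whole."""
    k1, k2 = split
    whole = concretize(cc, r, symbols)
    left = concretize(*contract(cc, r, k1), symbols)
    right = concretize(*contract(cc, r, k2), symbols)
    return left | right == whole and not (left & right)

XA1, XA2, Z = ("xa", "1"), ("xa", "2"), ("atom", "Z")
cc = ("cons", XA1, ("cons", XA2, Z))
split2 = ({XA1: ("atom", "A")}, {(XA1, ("atom", "A"))})   # Thm. 1, split 2
split3 = ({XA1: XA2}, {(XA1, XA2)})                       # Thm. 1, split 3
skewed = ({XA1: ("atom", "A")}, {(XA1, ("atom", "B"))})   # overlapping parts
```

The pair `skewed` is included as a counterexample: its two contractions overlap on values starting with 'A, so it is not a perfect split.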
Fig. 4. Abstract syntax of typed S-Graph (TSG)
(value domains named in the figure: Pexp; $xa \in \mathit{PAvar}$, $k \in \mathit{Cond}$, $ea \in \mathit{PAexp}$)



3 Source Language 

We consider the following first-order functional language, called TSG, as our 
source language. The language is a typed dialect of S-Graph [12]. The syntax of 
TSG is given by the grammar in Fig. 4; the operational semantics is defined in 
Fig. 5. An example program in concrete syntax is shown in Fig. 13. This family of 
languages has been used earlier for work on program transformation [2,11,12]. 

Syntax. A TSG-program is a sequence of function definitions where each def- 
inition contains the name, the parameters and the body of the function. The 
body of a function is a term which is either a function call call, a conditional if, 
or an expression e. Values can be constructed by atom, cons, and tested and/or 
decomposed by eqa?, cons?. Variables xa range over atoms, variables xe range 
over arbitrary values. The language is restricted to tail-recursion. 

We assume that every TSG-program p we consider is well-formed in the sense
that every function name that appears in a call in p is defined in p, that the
types of arguments and parameters are compatible, and that every variable x
used in the body of a definition q is a parameter of q or defined in an enclosing
conditional. The first definition in a program is called the main function. A program
p is represented by a program map $\Gamma$ which maps a function name f to the
corresponding definition in p.

Semantics. The evaluation of a term updates a program's state $(t, \sigma)$ which
consists of a term t and an environment $\sigma$. The meaning of each term is then a
state transformation computing the effect of the term on the state. We consider
the input of a program to be the arguments of a call to the program's main
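Condition Eqa?

    $\dfrac{ea_1/\sigma = ea_2/\sigma}{\sigma \vdash_{if} (\text{eqa?}\ ea_1\ ea_2)\ t_1\ t_2 \Rightarrow (t_1, \sigma)}$

    $\dfrac{ea_1/\sigma \neq ea_2/\sigma}{\sigma \vdash_{if} (\text{eqa?}\ ea_1\ ea_2)\ t_1\ t_2 \Rightarrow (t_2, \sigma)}$

Condition Cons?

    $\dfrac{e/\sigma = (\text{cons}\ d_1\ d_2) \qquad \sigma' = \sigma[xe_1 \mapsto d_1, xe_2 \mapsto d_2]}{\sigma \vdash_{if} (\text{cons?}\ e\ xe_1\ xe_2\ xa_3)\ t_1\ t_2 \Rightarrow (t_1, \sigma')}$

    $\dfrac{e/\sigma = (\text{atom}\ z) \qquad \sigma' = \sigma[xa_3 \mapsto e/\sigma]}{\sigma \vdash_{if} (\text{cons?}\ e\ xe_1\ xe_2\ xa_3)\ t_1\ t_2 \Rightarrow (t_2, \sigma')}$

Terms

    $\dfrac{\sigma \vdash_{if} k\ t_1\ t_2 \Rightarrow (t_j, \sigma') \qquad j \in \{1, 2\}}{\vdash_\Gamma ((\text{if}\ k\ t_1\ t_2), \sigma) \Rightarrow (t_j, \sigma')}$

    $\dfrac{\Gamma(f) = (\text{define}\ f\ x_1 \ldots x_n\ t) \qquad \sigma' = [x_1 \mapsto e_1/\sigma, \ldots, x_n \mapsto e_n/\sigma]}{\vdash_\Gamma ((\text{call}\ f\ e_1 \ldots e_n), \sigma) \Rightarrow (t, \sigma')}$

Transition

    $\dfrac{\vdash_\Gamma s \Rightarrow s'}{\vdash_\Gamma s \to s'}$

Semantic Values

    $s \in \mathit{PDstate} = \mathit{Term} \times \mathit{PDenv}$
    $\sigma \in \mathit{PDenv} = (\mathit{Pvar} \times \mathit{Dval})^{*}$
    $\Gamma \in \mathit{ProgMap} = \mathit{Fname} \to \mathit{Definition}$

Fig. 5. Operational semantics of TSG-programs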



function, and the output of a program (if it exists) to be the value returned by 
evaluating this call. The semantics of TSG relies on the following definitions. 

Values and variables. Values $d \in \mathit{Dval}$ are defined by the grammar in Fig. 1. In
addition, we use tuples of values $ds = [d_1, \ldots, d_n]$ as input for programs ($0 < n$).
The set of all value tuples will be denoted by Dvals. A program contains two
types of variables $x \in \mathit{Pvar}$. Variables $xa \in \mathit{PAvar}$ range over DAval, variables
$xe \in \mathit{PEvar}$ range over Dval. Recall that $\mathit{DAval} \subseteq \mathit{Dval}$.

Environment. An environment $\sigma = [x_1 \mapsto d_1, \ldots, x_n \mapsto d_n] \in \mathit{PDenv}$ is a
sequence of typed bindings such that variables $x_i$ are pairwise distinct, $d_i$ are
values, and $x_i \in \mathit{PAvar}$ implies $d_i \in \mathit{DAval}$ ($i = 1 \ldots n$). An environment $\sigma$ holds
the values of the program variables.

We write $\sigma[x \mapsto d]$ to denote the environment that is just like $\sigma$ except that
variable x is bound to value d, and we write $\sigma(x)$ to denote the value of x in $\sigma$.
Let $e \in \mathit{Pexp}$ and $\sigma \in \mathit{PDenv}$, then $e/\sigma$ denotes the value of e in $\sigma$ defined as
the result of replacing every variable x occurring in e by value $\sigma(x)$. If a program
is well-formed, then $\sigma$ in the rules of Fig. 5 defines a value for every x in e.

State. A state $s = (t, \sigma) \in \mathit{PDstate}$ is a term-environment pair that represents
the current state of computation. A state of the form $s = (e, \sigma)$ with $e \in \mathit{Pexp}$
is a terminal state; otherwise s is a non-terminal state.

Evaluation. Figure 5 defines a transition relation $\to$ between states. The rules are
straightforward. The rule for call states that a call to a function f returns a new
state $(t, \sigma')$ that contains the body t of f's definition and a new environment $\sigma'$
that binds each parameter $x_i$ of f to the value obtained by $e_i/\sigma$.

The rule for if states that, depending on the evaluation of condition k under
environment $\sigma$, a new state $(t_j, \sigma')$ is returned that contains one of the two
branches $t_1$ or $t_2$, and an updated environment $\sigma'$.

The two rules for eqa? state that, depending on the equality of values $ea_1/\sigma$
and $ea_2/\sigma$, a new state is formed containing term $t_1$ or $t_2$, and unchanged
environment $\sigma$. The two rules for cons? state that, depending on value $e/\sigma$, a
new state is formed containing term $t_1$ or $t_2$, and an updated environment $\sigma'$.
If value $e/\sigma$ has outermost constructor cons, environment $\sigma$ is extended with
variables $xe_1$, $xe_2$ bound to head and tail component of the value, respectively.
Otherwise, environment $\sigma$ is extended with variable $xa_3$ bound to atom $e/\sigma$.

Finally, the $\Gamma$-indexed transition relation $\to_\Gamma \subseteq (\mathit{PDstate} \times \mathit{PDstate})$ defines
a transition from a state s to a state s' in a program represented by program
map $\Gamma$. Even though the rule's formulation in Fig. 5 is trivial, we keep it for
later extension. We write $\to_\Gamma$ in infix notation and drop the $\Gamma$-index when it is
clear from the context. For example, we write $s \to s'$ when $(s, s') \in \to_\Gamma$.

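Definition 17 (program evaluation). Let p be a well-formed TSG-program
with main function q = (define f $x_1 \ldots x_n$ t), and let $ds = [d_1, \ldots, d_n] \in \mathit{Dvals}$.
We define initial state $s^{\circ}(p, ds) \stackrel{\mathrm{def}}{=} (t_0, \sigma_0)$ where $t_0$ = (call f $x_1 \ldots x_n$) and
$\sigma_0 = [x_1 \mapsto d_1, \ldots, x_n \mapsto d_n]$. We define program evaluation $[\![\cdot]\!]$ as follows:

    $[\![p]\!]\,ds \stackrel{\mathrm{def}}{=} \begin{cases} e/\sigma & \text{if } s^{\circ}(p, ds) \to^{*} (e, \sigma) \\ \text{undefined} & \text{otherwise.} \end{cases}$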
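
The operational semantics of Fig. 5 and Def. 17 can be read as a small-step interpreter. The Python sketch below is one possible rendering; the tuple encoding of terms, the dictionary used as program map, the function names and the step budget `fuel` are assumptions of this sketch, and the typing of variables (atoms versus arbitrary values) is not enforced.

```python
# Hypothetical encoding (an assumption of this sketch):
#   expressions: ("atom", z), ("cons", e1, e2), ("var", name)
#   terms:       ("call", f, [e1, ...]), ("if", cond, t1, t2), or an expression
#   conditions:  ("eqa?", ea1, ea2), ("cons?", e, xe1, xe2, xa3)
#   program map: dict from function name to (parameter list, body term)

def ev(e, env):
    """e/sigma: the value of expression e in environment env."""
    if e[0] == "var":
        return env[e[1]]
    if e[0] == "cons":
        return ("cons", ev(e[1], env), ev(e[2], env))
    return e  # ("atom", z)

def step(prog, term, env):
    """One transition (t, sigma) -> (t', sigma') following the rules of Fig. 5."""
    if term[0] == "call":
        _, f, args = term
        params, body = prog[f]
        return body, {x: ev(a, env) for x, a in zip(params, args)}
    _, cond, t1, t2 = term                      # term is a conditional
    if cond[0] == "eqa?":
        equal = ev(cond[1], env) == ev(cond[2], env)
        return (t1 if equal else t2), env
    _, e, xe1, xe2, xa3 = cond                  # cond is a cons? test
    v = ev(e, env)
    if v[0] == "cons":
        return t1, {**env, xe1: v[1], xe2: v[2]}
    return t2, {**env, xa3: v}

def run(prog, main, ds, fuel=10_000):
    """Program evaluation per Def. 17, cut off after `fuel` steps."""
    params, _ = prog[main]
    term = ("call", main, [("var", x) for x in params])
    env = dict(zip(params, ds))
    while term[0] in ("call", "if"):            # non-terminal state
        if fuel == 0:
            raise RuntimeError("step budget exhausted")
        term, env = step(prog, term, env)
        fuel -= 1
    return ev(term, env)                        # terminal state (e, sigma)

A, B, NIL = ("atom", "A"), ("atom", "B"), ("atom", "NIL")
prog = {
    # (define rightmost x (if (cons? x h t a) (call rightmost t) a))
    "rightmost": (["x"], ("if", ("cons?", ("var", "x"), "h", "t", "a"),
                          ("call", "rightmost", [("var", "t")]),
                          ("var", "a"))),
}
result = run(prog, "rightmost", [("cons", A, ("cons", B, NIL))])
```

The example program `rightmost` is made up for this sketch; it is tail-recursive, as TSG requires, and returns the atom that terminates its argument.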



4 Program-Related Extension of the Set Representation 

We extend the set representation introduced in Sect. 2 to program-related con- 
structions needed for inverse computation of TSG-programs, such as state, en- 
vironment, and input. These notions are language dependent and relate to the 
operational semantics of TSG. 

First, we extend the definition of c-construction cc to include c-state s,
c-binding b, c-environment $\sigma$, and c-input ds (Fig. 6). That is, domain Ccon
(Def. 2) is extended to include all of these sets. Second, we extend the
application of substitution to all program-related c-constructions (Fig. 7). Beside these
Fig. 6. A program-related extension of c-constructions

C-State:
    $(t, \sigma)/\theta = (t, \sigma/\theta)$

C-Binding:
    $(x \mapsto d)/\theta = (x \mapsto d/\theta)$

C-Environment:
    $[b_1, \ldots, b_n]/\theta = [b_1/\theta, \ldots, b_n/\theta]$

C-Input:
    $[d_1, \ldots, d_n]/\theta = [d_1/\theta, \ldots, d_n/\theta]$

Fig. 7. Substitution on program-related c-constructions



extensions, all definitions and results from Sect. 2 remain valid. In particular,
Thm. 1 (perfect splits) holds for the extended set of c-constructions.

The extension of domain Ccon leads to new cr-pairs. A cr-pair containing a
c-state s is called a configuration. A cr-pair containing a c-input ds is called a class.
They represent a set of states and a set of value tuples, respectively.



Definition 18 (class, configuration). A cr-pair $(ds, r)$ where $ds \in \mathit{Cexps}$ is
a class. A cr-pair $(s, r)$ where $s \in \mathit{PCstate}$ is a configuration. By Class and
Conf we denote the set of classes and the set of configurations, respectively.

Definition 19 (well-formed input class, initial configuration). Let p be
a well-formed TSG-program with main function q = (define f $x_1 \ldots x_n$ t), and
let $cls = ([d_1, \ldots, d_n], r) \in \mathit{Class}$. We say that cls is a well-formed input class
for p if $\lceil cls \rceil \neq \emptyset$ and variable $x_i \in \mathit{PAvar}$ implies $d_i \in \mathit{CAexp}$ ($i = 1 \ldots n$). We
define initial configuration $c^{\circ}(p, cls) \stackrel{\mathrm{def}}{=} ((t_0, \sigma_0), r)$ where cls is a well-formed
input class for p, $t_0$ = (call f $x_1 \ldots x_n$) and $\sigma_0 = [x_1 \mapsto d_1, \ldots, x_n \mapsto d_n]$.

Fig. 8. Conceptual approach: three steps to inverse computation



5 Three Steps to Inverse Computation 

Inverse computation can be organized into three steps: walking through a perfect 
process tree, then tabulating the input-output pairs, and finally extracting the 
answer to the inversion problem from the input-output pairs. 

The key idea used in our approach is based on the notion of a perfect process 
tree which represents the computation of a program with missing input by a 
tree of all possible computation traces. Each fork in the tree partitions the input 
class into disjoint classes. Our algorithm then constructs, breadth-first and lazily, 
a perfect process tree for a given program p and input class $cls_{in}$. We shall
not be concerned with different implementation techniques, but with a rigorous 
development of the principles and foundations of inverse computation. 

In general, inverse computation using ura takes the form

    $[\![\mathit{ura}]\!]\ [p, [cls_{in}, d_{out}]] = ans$

where p is a program, $cls_{in}$ is an input class, and $d_{out}$ the output. We say that tuple
$[cls_{in}, d_{out}]$ is a request for inverse computation where class $cls_{in}$ specifies the set
of admissible input (the search space), and $d_{out}$ is the fixed output. The set ans
is a solution of the given inversion problem. It is a set of substitution-restriction
pairs $ans = \{(\theta_1, r_1), \ldots\}$ which represents the largest subset of $\lceil cls_{in} \rceil$ such that
$[\![p]\!]\,ds_{in} = d_{out}$ for all elements $(\theta_i, r_i)$ of the solution and $ds_{in} \in \lceil (cls_{in}/\theta_i)/r_i \rceil$. More
formally, a correct solution to an inversion problem is specified by

    $\bigcup_i \lceil (cls_{in}/\theta_i)/r_i \rceil = \{\, ds_{in} \mid ds_{in} \in \lceil cls_{in} \rceil,\ [\![p]\!]\,ds_{in} = d_{out} \,\}$.

In the following sections we present each of the three steps: 

1. Perfect Process Tree: tracing program p under standard computation
with $cls_{in}$.

2. Tabulation: forming the table of input-output pairs from the perfect process
tree and class $cls_{in}$.

3. Inversion: extracting the answer for given output $d_{out}$ from the table of
input-output pairs.
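
The last two steps can be pictured as a lazy pipeline: a table of input-output pairs is produced on demand, and inversion filters it by the requested output. The Python sketch below illustrates only this structure. The toy table (natural numbers mapped to their remainder modulo 3) is a stand-in invented for this sketch, and its entries are single inputs rather than the classes that the algorithm in this paper tabulates.

```python
from itertools import count, islice

def tabulate():
    """Hypothetical stand-in for step 2: lazily yield (input, output) pairs.
    Here the 'program' maps a natural number n to n % 3."""
    for n in count():
        yield n, n % 3

def invert(table, d_out):
    """Step 3: keep the inputs whose tabulated output equals d_out."""
    return (inp for inp, out in table if out == d_out)

# The table is infinite, yet each solution appears after finitely many steps.
first_three = list(islice(invert(tabulate(), 0), 3))
```

Because the table is consumed lazily, asking for a fixed number of solutions terminates even though the enumeration as a whole does not.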

The structure of the algorithm is illustrated in Fig. 8. Since our method is
sound and complete, and since TSG is a universal programming language, which
follows from the fact that the Universal Turing Machine can be defined in it, we
can apply inverse computation, in principle, to any computable function. Thus
our method of inverse computation has full generality.

The organization of inverse computation given here can be used for virtually 
any programming language. TSG is only a means to develop and fully formalize 
an algorithm for inverse computation. In fact, the set representation introduced 
in Sect. 2 can be used for any programming language with S-expressions, for 
example, for a subset of Lisp, or a simple flowchart language with S-expressions. 
Only the notions of state and configuration may change depending on the lan- 
guage. Changing the source language affects the construction of the perfect pro- 
cess tree, while the tabulation and inversion steps are not affected. 

6 Walking the Perfect Process Tree 

The transition relation in Fig. 9 defines walks through a perfect process tree [12]. 
Starting from a partially specified input, the goal is to follow all possible walks a 
standard evaluation may take under this generalized input. This will be the basis 
for inverse computation where the input of a program is only partially specified. 

Process tree. A computation process is a potentially infinite sequence of states 
and transitions. Each state and transition in a deterministic computation is fully 
defined. The set of computation processes captures the semantics of a program 
as a whole. A process tree is used to examine the set of computation processes 
when the computation is not deterministic (because the input is only partly 
specified). Each node in a process tree contains a set of states represented by a 
configuration. A configuration which branches to two or more configurations in 
a process tree corresponds to a conditional transition from one set of program 
states to two or more sets of program states. 

As defined in [12], a walk w in a process tree g is feasible if at least one initial 
state exists which selects w. A node n in a process tree g is feasible if it belongs 
at least to one feasible walk w in g. A process tree g is perfect if all walks in g 
are feasible. 

Role of perfectness. The two most important operations when developing a pro- 
cess tree are: 

1. applying perfect splits at branching configurations, 

2. cutting infeasible branches in the tree. 

Cutting infeasible branches is important because an infeasible branch is ei- 
ther non-terminating, or terminating in an unreachable node. The risk of enter- 
ing non-terminating branches makes inverse computation less terminating (but 
completeness of the solution can be preserved). A terminating branch leads to 
a terminal state that can only be associated with an empty set of input in the 
solution (but soundness of the solution is preserved). In short, the correctness of
the solution can be guaranteed, but an algorithm for inverse computation be- 
comes less terminating and less efficient. The correctness of the solution cannot 
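Condition Eqa?

    $\dfrac{ea_1/\sigma = ea_2/\sigma}{\sigma \vdash_{if} (\text{eqa?}\ ea_1\ ea_2)\ t_1\ t_2 \Rightarrow ((t_1, \sigma), \kappa_{id})}$

    $\dfrac{ea_1/\sigma \neq ea_2/\sigma \qquad (ea_1/\sigma \mathbin{\#} ea_2/\sigma) \notin \mathit{Tauto} \qquad \kappa = [\mathit{mkBind}(ea_1/\sigma, ea_2/\sigma)]}{\sigma \vdash_{if} (\text{eqa?}\ ea_1\ ea_2)\ t_1\ t_2 \Rightarrow ((t_1, \sigma), \kappa)}$

    $\dfrac{ea_1/\sigma \neq ea_2/\sigma \qquad \kappa = \{(ea_1/\sigma \mathbin{\#} ea_2/\sigma)\}}{\sigma \vdash_{if} (\text{eqa?}\ ea_1\ ea_2)\ t_1\ t_2 \Rightarrow ((t_2, \sigma), \kappa)}$

Condition Cons?

    $\dfrac{e/\sigma = (\text{cons}\ d_1\ d_2) \qquad \sigma' = \sigma[x_1 \mapsto d_1, x_2 \mapsto d_2]}{\sigma \vdash_{if} (\text{cons?}\ e\ x_1\ x_2\ x_3)\ t_1\ t_2 \Rightarrow ((t_1, \sigma'), \kappa_{id})}$

    $\dfrac{e/\sigma = da \qquad \sigma' = \sigma[x_3 \mapsto da]}{\sigma \vdash_{if} (\text{cons?}\ e\ x_1\ x_2\ x_3)\ t_1\ t_2 \Rightarrow ((t_2, \sigma'), \kappa_{id})}$

    $\dfrac{e/\sigma = X_e \qquad \sigma' = \sigma[x_1 \mapsto X_{e1}^{*}, x_2 \mapsto X_{e2}^{*}] \qquad \kappa = [X_e \mapsto (\text{cons}\ X_{e1}^{*}\ X_{e2}^{*})]}{\sigma \vdash_{if} (\text{cons?}\ e\ x_1\ x_2\ x_3)\ t_1\ t_2 \Rightarrow ((t_1, \sigma'), \kappa)}$

    $\dfrac{e/\sigma = X_e \qquad \sigma' = \sigma[x_3 \mapsto X_a^{*}] \qquad \kappa = [X_e \mapsto X_a^{*}]}{\sigma \vdash_{if} (\text{cons?}\ e\ x_1\ x_2\ x_3)\ t_1\ t_2 \Rightarrow ((t_2, \sigma'), \kappa)}$

Terms

    $\dfrac{\sigma \vdash_{if} k\ t_1\ t_2 \Rightarrow ((t_j, \sigma'), \kappa) \qquad j \in \{1, 2\}}{\vdash_\Gamma ((\text{if}\ k\ t_1\ t_2), \sigma) \Rightarrow ((t_j, \sigma'), \kappa)}$

    $\dfrac{\Gamma(f) = (\text{define}\ f\ x_1 \ldots x_n\ t) \qquad \sigma' = [x_1 \mapsto e_1/\sigma, \ldots, x_n \mapsto e_n/\sigma]}{\vdash_\Gamma ((\text{call}\ f\ e_1 \ldots e_n), \sigma) \Rightarrow ((t, \sigma'), \kappa_{id})}$

Transition

    $\dfrac{\vdash_\Gamma s \Rightarrow (s', \kappa) \qquad r/\kappa \neq \{\mathit{contra}\}}{\vdash_\Gamma (s, r) \longmapsto (s', r)/\kappa}$

Semantic Values

    $s \in \mathit{PCstate} = \mathit{Term} \times \mathit{PCenv}$
    $\sigma \in \mathit{PCenv} = (\mathit{Pvar} \times \mathit{Cexp})^{*}$
    $\Gamma \in \mathit{ProgMap} = \mathit{Fname} \to \mathit{Definition}$

Fig. 9. Trace semantics for perfect process trees of TSG-programs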



be guaranteed without applying perfect splits because in this case empty sets of
input can be detected neither during the development of the tree nor in the
solution. Our formulation of the transition relation includes both operations.



Walking a process tree. Fig. 9 defines a transition relation $\longmapsto$ between
configurations. The transition relation does not actually construct a tree, but allows
us to perform all walks in a perfect process tree. The transition relation is
non-deterministic when a condition (eqa?, cons?) cannot be decided. In this case
the rules permit us to follow either of the two possible branches.

The transition rule states that a configuration $(s, r)$ is transformed into a
new configuration which is obtained by evaluating c-state s to a new c-state s',
and applying contraction $\kappa$ of the associated perfect split to configuration $(s', r)$
if this does not lead to a contradiction (which would mean the transition is not
feasible). The rule ensures perfect splitting and cutting of infeasible branches.

The rules for if and call are similar to the rules for the operational semantics
in Fig. 5 except that they take a c-state to a new c-state and an associated
contraction $\kappa$. In case of a call, identity contraction $\kappa_{id}$ is returned (no split); in case
of a conditional, contraction $\kappa$ produced by evaluating condition k is returned.

We now describe the rules for conditions in more detail. The three rules for
eqa? state that, depending on the equality of ca-expressions $ea_1/\sigma$ and $ea_2/\sigma$, a
new c-state is formed which is associated with a contraction $\kappa$. The first equality
rule applies if ca-expressions $ea_1/\sigma$ and $ea_2/\sigma$ are equal, which means they
represent the same set of values. The second and third rule may apply at the
same time. This is the case when $ea_1/\sigma$ and $ea_2/\sigma$ are not equal and at least one
of the two ca-expressions is a c-variable (i.e., inequality $(ea_1/\sigma \mathbin{\#} ea_2/\sigma)$ is not a
tautology). Then c-states $(t_1, \sigma)$ and $(t_2, \sigma)$ are associated with the corresponding
contraction of the perfect split (Thm. 1, split 2, 3): $(t_1, \sigma)$ is equipped with
a substitution binding the ca-variable to the other ca-expression, and $(t_2, \sigma)$
is equipped with an inequality between $ea_1/\sigma$ and $ea_2/\sigma$. Auxiliary function
mkBind makes a binding of its arguments ensuring that a ca-variable appears
on the left hand side of that binding.

The four rules for cons? associate a new c-state with a contraction $\kappa$. The
first two rules correspond to the two cons? rules in Fig. 5 except that $e/\sigma$ is
a c-expression. If $e/\sigma$ has outermost constructor cons then the true-branch is
entered, otherwise, the false-branch is entered. In case $e/\sigma$ is a ce-variable $X_e$,
the third and fourth rule apply and c-states $(t_1, \sigma')$ and $(t_2, \sigma')$, each with its own
updated environment $\sigma'$, are equipped
with the corresponding contraction of the perfect split (Thm. 1, split 4): $(t_1, \sigma')$
is equipped with a substitution instantiating $X_e$ to a new cons-expression (where
$X_{e1}^{*}$ and $X_{e2}^{*}$ are fresh ce-variables), and $(t_2, \sigma')$ is equipped with a substitution
binding ce-variable $X_e$ to a fresh ca-variable $X_a^{*}$.



Correctness. Proving the trace semantics for perfect process trees (Fig. 9) correct 
wrt the operational semantics of TSG must consist of a soundness and complete- 
ness argument. First, we state the correctness of an initial configuration and a 
transition step, and then state the main correctness result. We shall not be con- 
cerned with the technical details of the proofs in this paper, only with the fact [2] 
that the trace semantics is correct wrt the operational semantics. 

Theorem 2 (correctness of initial configuration). Let p be a well-formed
TSG-program, let cls be a well-formed input class for p, then
Completeness and Soundness: $\lceil c^{\circ}(p, cls) \rceil = \{\, s^{\circ}(p, ds) \mid ds \in \lceil cls \rceil \,\}$.
Transition

$$\frac{\vdash_\Gamma s \Rightarrow (s', \kappa) \qquad r/\kappa \neq \{contra\}}{\vdash_\Gamma (cls, (s, r)) \to_{tab} (cls/\kappa,\ (s', r)/\kappa)}$$

Semantic Values

$$tab \in Tab = Class \times Cexp$$

Fig. 10. Tabulation of TSG-programs



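Theorem 3 (correctness of ppt-transition). Let p be a well-formed TSG-program,
and let c be a well-formed configuration for p, then
Completeness: $\forall s \in \lceil c \rceil .\ \forall s' .\ (\vdash_\Gamma s \to s') \Rightarrow (\exists c' .\ (\vdash_\Gamma c \to c' \wedge s' \in \lceil c' \rceil))$
Soundness: $\forall c' .\ (\vdash_\Gamma c \to c') \Rightarrow (\forall s' \in \lceil c' \rceil .\ \exists s \in \lceil c \rceil .\ \vdash_\Gamma s \to s')$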

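Theorem 4 (correctness of ppt). Let p be a well-formed TSG-program, let
cls be a well-formed input class for p, then

Completeness:

$$\forall ds \in \lceil cls \rceil .\ \forall s_0 \ldots s_n .\ \Big( s_0 = s^{\circ}(p, ds) \wedge \bigwedge_{i=0}^{n-1} \vdash_\Gamma s_i \to s_{i+1} \Big) \Rightarrow \exists c_0 \ldots c_n .\ \Big( c_0 = c^{\circ}(p, cls) \wedge \bigwedge_{i=0}^{n-1} \vdash_\Gamma c_i \to c_{i+1} \wedge \bigwedge_{i=0}^{n} s_i \in \lceil c_i \rceil \Big)$$

Soundness:

$$\forall c_0 \ldots c_n .\ \Big( c_0 = c^{\circ}(p, cls) \wedge \bigwedge_{i=0}^{n-1} \vdash_\Gamma c_i \to c_{i+1} \Big) \Rightarrow \exists ds \in \lceil cls \rceil .\ \exists s_0 \ldots s_n .\ \Big( s_0 = s^{\circ}(p, ds) \wedge \bigwedge_{i=0}^{n-1} \vdash_\Gamma s_i \to s_{i+1} \wedge \bigwedge_{i=0}^{n} s_i \in \lceil c_i \rceil \Big)$$
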

Proof: Omitted (base case Thm. 2, induction step Thm. 3). □ 

7 Tabulation and Inversion 

Before defining the solution of inverse computation, we define the tabulation of
a program p for a given input class $cls_{in}$. Tabulation divides input class $cls_{in}$
into disjoint input classes, each of which is associated with a leaf (output) in
the process tree. All input-output pairs are collected in a set $TAB(p, cls_{in})$. For
this we define a transition relation $\to_{tab}$ (Fig. 10) that carries an input class
and applies to it every contraction $\kappa$ encountered while following a path in the
process tree. Finally, we define the solution of inverse computation as the set
$ANS(p, cls_{in}, d_{out})$.

Definition 20 (tabulation). Let p be a well-formed TSG-program, let $cls_{in}$ be
a well-formed input class for p. Define tabulation of p on $cls_{in}$ as follows:

$$TAB(p, cls_{in}) \stackrel{\mathrm{def}}{=} \{\, (cls, e/\sigma) \mid (cls_{in}, c^{\circ}(p, cls_{in})) \to_{tab}^{*} (cls, ((e, \sigma), r)) \,\}$$

Definition 21 (inverse computation). Let p be a well-formed TSG-program,
let $cls_{in}$ be a well-formed input class for p, and let $d_{out} \in Dval$. Define inverse
computation of p on $cls_{in}$ and $d_{out}$ as follows:

$$ANS(p, cls_{in}, d_{out}) \stackrel{\mathrm{def}}{=} \{\, (\theta, r) \mid (cls, \hat{d}) \in TAB(p, cls_{in}),\ \theta, \theta' \in CCsub,\ r \in Restr,\ \hat{d}/\theta' = d_{out},\ cls_{in}/\theta/r = cls/\theta' \,\}$$
Correctness. Proving the correctness of tabulation $TAB(p, cls_{in})$ must consist of
a soundness and completeness argument. For completeness we must prove that
for each evaluation $[\![p]\!][d_1, \ldots, d_n] = d$, there is an input-output pair
$(cls, \hat{d}) \in TAB(p, cls_{in})$ such that $[d_1, \ldots, d_n] \in \lceil cls \rceil$ and $d \in \lceil \hat{d} \rceil$.
For soundness we must prove that each $(cls, \hat{d}) \in TAB(p, cls_{in})$ and each
$[d_1, \ldots, d_n] \in \lceil cls \rceil$ implies $[\![p]\!][d_1, \ldots, d_n] = d$ and $d \in \lceil \hat{d} \rceil$.
The corresponding argument for set $ANS(p, cls_{in}, d_{out})$ is based on the
correctness of the tabulation. We shall not
be concerned with the technical details of the proofs, only with the fact [2] that
tabulation and inversion are correct wrt the operational semantics.

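Theorem 5 (correctness of TAB). Let p be a well-formed TSG-program, let
$cls_{in}$ be a well-formed input class for p, and let $T = TAB(p, cls_{in})$, then
completeness and soundness:

$$\{\, (ds_{in}, d) \mid ds_{in} \in \lceil cls_{in} \rceil,\ [\![p]\!]\, ds_{in} = d \,\} = \{\, (\widehat{ds}/\theta,\ \hat{d}/\theta) \mid ((\widehat{ds}, r), \hat{d}) \in T,\ \theta \in FS((\widehat{ds}, r)),\ r/\theta = \emptyset \,\}$$

Theorem 6 (correctness of ANS). Let p be a well-formed TSG-program, let
$cls_{in}$ be a well-formed input class for p, let $d_{out} \in Dval$ and $A = ANS(p, cls_{in}, d_{out})$,
then completeness and soundness:

$$\{\, ds_{in} \mid ds_{in} \in \lceil cls_{in} \rceil,\ [\![p]\!]\, ds_{in} = d_{out} \,\} = \bigcup_{(\theta, r) \in A} \lceil cls_{in}/\theta/r \rceil$$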

The most important property of set $TAB(p, cls_{in})$ is the perfectness property:
it allows us to invert all input-output pairs independently and in any order.



Theorem 7 (perfectness of TAB). Let p be a well-formed TSG-program, let
$cls_{in}$ be a well-formed input class for p, and let $(cls_1, \hat{d}_1)$ and $(cls_2, \hat{d}_2)$ be two
different input-output pairs from $TAB(p, cls_{in})$, then $\lceil cls_1 \rceil \cap \lceil cls_2 \rceil = \emptyset$.



8 Algorithmic Aspects 

In this section we discuss algorithmic aspects related to the Universal Resolving
Algorithm and present our Haskell implementation. While Def. 21 specifies the
solution obtained from the tabulation of the perfect process tree, an algorithm 
for inverse computation must actually traverse the process tree according to 
some algorithmic strategy and extract the solution from the leaves. 

The algorithm is fully implemented in Hugs, a dialect of Haskell, a lazy 
functional language (321 lines of pretty-printed source text).^ The algorithm 
is structured into three separate functions: (1) function ppt that builds a poten- 
tially infinite process tree, (2) function tab that consumes the tree to perform 
the tabulation, and (3) function inv that enumerates set $ANS(p, cls_{in}, d_{out})$.
The main function ura which performs inverse computation is defined by 

® Hugs-script available by http://www.botik.ru/AbrGlu/URA/MPC2000 
ppt :: ProgTSG -> Class -> Tree
ppt p cls@(ces, r) = evalT c p i
    where (DEFINE f xs _) : _ = p
          env = mkEnv xs ces
          c   = ((CALL f xs, env), r)
          i   = freeind 0 cls

evalT :: Conf -> ProgTSG -> Freeind -> Tree

evalT c@((CALL f es, env), r) p i = NODE c [(kId, evalT c' p i)]
    where DEFINE _ xs t = getDef f p
          env' = mkEnv xs (es/.env)
          c'   = ((t, env'), r)

evalT c@((IF cond t1 t2, env), r) p i = NODE c (brT ++ brF)
    where ((kT,kF), bindsT, bindsF, i') = ccond cond env i
          brT = mkBr t1 kT bindsT
          brF = mkBr t2 kF bindsF
          mkBr t k binds = case r' of
                             [CONTRA] -> []
                             _        -> [(k, evalT c' p i')]
              where ((_, env'), r') = c/.k
                    c' = ((t, env' +. binds), r')

evalT c@((e, env), r) p i = LEAF c

ccond :: Cond -> PCenv -> Freeind -> (Split, PCenv, PCenv, Freeind)
ccond (EQA? ea1 ea2) env i =
    let cea1 = ea1/.env; cea2 = ea2/.env in case (cea1, cea2) of
      (a, b) | a==b    -> ((kId,kContra), [], [], i)
      (ATOM _, ATOM _) -> ((kContra,kId), [], [], i)
      (XA _, cea)      -> (splitA cea1 cea, [], [], i)
      (cea, XA _)      -> (splitA cea2 cea, [], [], i)

ccond (CONS? e xh xt xa) env i =
    let ce = e/.env in case ce of
      CONS ceh cet -> ((kId,kContra), [xh:=ceh, xt:=cet], [], i)
      ATOM a       -> ((kContra,kId), [], [xa:=ce], i)
      XA _         -> ((kContra,kId), [], [xa:=ce], i)
      XE _         -> (split, [xh:=cxh, xt:=cxt], [xa:=cxa], i')
        where
          (split, i') = splitE ce i
          (S[_:->(CONS cxh cxt)], S[_:->cxa]) = split

Fig. 11. Function ppt for constructing perfect process trees (written in Haskell)



ura :: ProgTSG -> Class -> Dval -> [(CCsub,Restr)]
ura p cls out = inv (tab (ppt p cls) cls) cls out

Given source program p, class cls and output out, function ura returns a list of
substitution-restriction pairs (CCsub,Restr) . Due to the lazy evaluation strategy 
of Haskell, the process tree and the tabulation are only developed on demand by 
type Tab = [(Class, Cexp)]

tab :: Tree -> Class -> Tab
tab tree cls = tab' [(cls, tree)]
    where tab' [] = []
          tab' ((cls, LEAF ((e,env),_)) : cts) = (cls, e/.env) : (tab' cts)
          tab' ((cls, NODE _ brs) : cts) =
              tab' (cts ++ (map (\(k,tree) -> (cls/.k, tree)) brs))

inv :: Tab -> Class -> Dval -> [(CCsub,Restr)]
inv tab cls out = concat (map ans tab)
    where ans (cls_i, ce_i) =
            case (clash [ce_i] [out]) of
              (False, _)   -> []
              (True, sub') -> case cls_i' of
                                (_, [CONTRA]) -> []
                                _             -> [(sub, r)]
                  where cls_i' = cls_i/.sub'
                        (sub, r) = subClassCntr cls cls_i'

Fig. 12. Functions tab and inv for tabulation and inversion (written in Haskell)



function ura. The type definitions Class, Dval, CCsub and Restr correspond to the 
domains Class, Dval, CCsub, and Restr; the source program is typed ProgTSG. 
The implementation of the functions ppt, tab, inv is shown in Figs. 11 and 12. 

Function ppt in Fig. 11 implements the trace semantics from Fig. 9 such that 
all applicable rules are fired at the same time. The function makes use of a tree 
structure to record all walks: 

data Tree = LEAF Conf                type Branch = (Contr, Tree)
          | NODE Conf [Branch]

For each rule that applies a branch is added (one branch if the transition is 
deterministic, two branches if the transition is non-deterministic). Each node is 
labeled with the current configuration c, and each branch with the contraction k 
used to split c (the contraction k is needed for tabulation). Function ppt is the 
initial function, function evalT constructs the tree, and function ccond evaluates a 
condition. The reader may notice the format returned by function ccond: a tuple 
that contains the split to be performed on the current configuration, possibly 
updated bindings for the true- and false-branch, and a free index i (used for 
generating fresh variables). Infix operator /. implements substitution /, and
infix operator +. implements update $\sigma[x_1 \mapsto d_1, \ldots, x_n \mapsto d_n]$.

Auxiliary functions splitA and splitE return the perfect splits for ca- and 
ce- variables, respectively (as defined in Thm. 1, perfect splits): 

splitA :: CAvar -> CAexp -> Split                  -- Thm.1: split 2,3
splitA cxa cea = (S[cxa:->cea], R[cxa:#:cea])



206 Sergei Abramov and Robert Glück



splitE :: CAvar -> Freeind -> (Split, Freeind)     -- Thm.1: split 4
splitE cxe i = ((S[cxe:->(CONS cxe'h cxe't)], S[cxe:->cxa]), i')
    where cxe'h = newCEvar(i);   cxa = newCAvar(i+2)
          cxe't = newCEvar(i+1); i'  = i+3

Function tab in Fig. 12 consumes the process tree produced by ppt using a 
breadth-first strategy^ in order to ensure that all leaves on finite branches will 
eventually be visited. This is important because a depth-first strategy may ‘fall’ 
into an infinite branch, never visiting other branches. 
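
The difference between the two traversal orders can be seen on a toy tree. The following is only a sketch and not the paper's code: the type `Tree`, the functions `leavesBF` and `leavesDF`, and the example `spine` are hypothetical simplifications that carry no configurations or contractions. The queue discipline in `leavesBF` mirrors the way tab' appends the children of a node to the end of its work list.

```haskell
-- Simplified rose tree: leaves carry a result, nodes only carry subtrees.
-- (Hypothetical stand-in for the paper's Tree of configurations.)
data Tree a = Leaf a | Node [Tree a]

-- Breadth-first enumeration of leaves with a queue of pending trees:
-- the children of a node are appended to the END of the queue.
leavesBF :: Tree a -> [a]
leavesBF t = go [t]
  where
    go []               = []
    go (Leaf x  : rest) = x : go rest
    go (Node ts : rest) = go (rest ++ ts)

-- Depth-first enumeration: children are explored before siblings.
leavesDF :: Tree a -> [a]
leavesDF (Leaf x)  = [x]
leavesDF (Node ts) = concatMap leavesDF ts

-- An infinite left spine with one finite leaf hanging off every level.
spine :: Int -> Tree Int
spine n = Node [spine (n + 1), Leaf n]
```

Under lazy evaluation, `take 3 (leavesBF (spine 0))` yields the first three leaves, whereas `take 1 (leavesDF (spine 0))` never returns because the depth-first walk keeps descending the infinite left branch.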

Function inv in Fig. 12 enumerates the set $ANS(p, cls_{in}, d_{out})$ according
to Def. 21. Two auxiliary functions clash and subClassCntr are used. Given
$ds_1, ds_2 \in Cexps$, the auxiliary function clash returns (True, $\theta$) if a substitution
$\theta \in CCsub$ exists such that $ds_1/\theta = ds_2$ and $dom(\theta) = var(ds_1)$; otherwise
it returns (False, []). The requirement for the domain of $\theta$ ensures that no redundant
bindings are added and that, if a solution exists, we produce a unique $\theta$.

Given $cls, cls' \in Class$ where $cls'$ can be obtained from $cls$ by several contractions,
the auxiliary function subClassCntr returns $(\theta, r)$ where $\theta \in CCsub$,
$r \in Restr$ such that $cls' = (cls/\theta)/r$ and $dom(\theta) = var(cls)$.
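
The behaviour described for clash is that of one-way matching. The sketch below illustrates the idea on a single expression rather than on lists, and it ignores the distinction between ca- and ce-variables. All names (`CExp`, `Subst`, `match`, `clashSketch`) are hypothetical and are not taken from the implementation of the paper.

```haskell
-- Toy c-expressions: atoms, pairs, and variables (hypothetical names).
data CExp = Atom String | Cons CExp CExp | Var String
  deriving (Eq, Show)

-- A substitution as an association list from variable names to expressions.
type Subst = [(String, CExp)]

-- match pat val: look for a substitution th such that pat/th == val,
-- binding only variables that occur in pat. Nothing signals a clash.
match :: CExp -> CExp -> Maybe Subst
match pat val = go pat val []
  where
    go (Var x) v th = case lookup x th of
      Nothing -> Just ((x, v) : th)           -- first occurrence: bind it
      Just v' | v' == v   -> Just th           -- repeated variable must agree
              | otherwise -> Nothing
    go (Atom a) (Atom b) th | a == b = Just th
    go (Cons h t) (Cons h' t') th = go h h' th >>= go t t'
    go _ _ _ = Nothing

-- Result in the shape described in the text: (True, th) or (False, []).
clashSketch :: CExp -> CExp -> (Bool, Subst)
clashSketch pat val = maybe (False, []) (\th -> (True, th)) (match pat val)
```

Because only variables of the pattern are bound, and each at most once, the substitution is unique whenever it exists, which is the property the domain requirement in the text is meant to secure.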

Termination. Of course, inverse computation is undecidable, so an algorithm 
cannot be sound, complete, and terminating. Our algorithm is sound and com- 
plete, but not always terminating. Each solution, if it exists, is computed in finite 
time due to the breadth-first strategy. The algorithm does not always terminate
because the search for solutions in a process tree may continue indefinitely (even
though all elements of the solution have been found). The algorithm terminates if all
branches in a process tree are finite. 

9 Experiments and Results 

This section illustrates the Universal Resolving Algorithm by means of exam- 
ples. The first example illustrates inverse computation of a pattern matcher, the 
second example demonstrates the inverse interpretation of While-programs.® 



Pattern matching. We performed the two inversion tasks from Sect. 1 using 
a naive pattern matcher written in TSG (Fig. 13). 

- Task 1: Find the set of strings pattern which are substrings of "ABC".
  To perform this task we leave input pattern unknown (Xe1), set input
  string = "ABC" and the desired output to 'SUCCESS.

- Task 2: Find the set of strings pattern which are not substrings of "AAA".
  To perform this task we use a setting similar to Task 1 (pattern = Xe1,
  string = "AAA"), but the desired output is set to 'FAILURE.

^ The breadth-first strategy is implemented in the last line of function tab by ap- 
pending the list of next-level-nodes produced by map to the end of list cts. 

® Run times given for Hugs 98, PC/Intel Pentium MMX-233MHz, MS Windows 95. 
match =
 [(DEFINE "Match" [p,s]
     (CALL "CheckPos" [p,s,p,s])),
  (DEFINE "CheckPos" [p,s,pp,ss]
     (IF (CONS? p ph pt a)
        (IF (CONS? ph _ _ a'ph)
           'ERROR:Atom_expected
           (IF (CONS? s sh st a)
              (IF (CONS? sh _ _ a'sh)
                 'ERROR:Atom_expected
                 (IF (EQA? a'ph a'sh)
                    (CALL "CheckPos" [pt,st,pp,ss])
                    (CALL "NextPos" [pp,ss])))
              'FAILURE))
        'SUCCESS)),
  (DEFINE "NextPos" [p,s]
     (IF (CONS? s sh st a)
        (CALL "Match" [p,st])
        'FAILURE))]

Fig. 13. Naive pattern matcher written in concrete TSG syntax



Figure 14 shows the results of applying URA to the matcher. The answer for
Task 1 is a finite representation of all possible substrings of string "ABC", Fig. 14(i).
The answer for Task 2 is a finite representation of all strings which are not
substrings of "AAA", Fig. 14(ii). URA terminates after 0.5 seconds (Task 1, Task 2).
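
The answer for Task 1 can be cross-checked against a direct enumeration. The sketch below is not part of URA: it lists the contiguous substrings of a string with standard list functions, assuming that str"..." denotes a plain sequence of characters. The helper names `substrings` and `matches` are hypothetical.

```haskell
import Data.List (inits, isInfixOf, nub, tails)

-- All contiguous substrings of a string, including the empty one,
-- with duplicates removed.
substrings :: String -> [String]
substrings s = nub (concatMap inits (tails s))

-- Forward check corresponding to the matcher's SUCCESS / FAILURE outcome.
matches :: String -> String -> Bool
matches pat str = pat `isInfixOf` str
```

For "ABC" the enumeration contains seven strings ("", "A", "AB", "ABC", "B", "BC", "C"), the same seven that are listed as comments next to the answers in Fig. 14(i). The answer set of Task 2 is infinite, so it can only be spot-checked this way, for instance with `matches "AAAA" "AAA"`, which is False.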



Interpreter inversion. As proven in [3,4], inverse computation can be performed
in a programming language N given a standard interpreter intN for N
written in L, and an inverse interpreter for L. The result obtained by inverse
computation of N's interpreter is a solution for inverse computation in N. The
theorem guarantees that the solution is correct for all N-programs regardless of
intN's operational properties. Since TSG is a universal programming language
we can, in principle, perform inverse computation in any programming language.
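
The structure of this argument can be illustrated with a deliberately naive stand-in. In the sketch below the inverse interpreter for the host language is a generate-and-test search over a finite candidate list, not the Universal Resolving Algorithm, and the language N is a hypothetical toy language (not MP). The point is only that inverting the interpreter with its program argument fixed yields inverse computation for N-programs.

```haskell
-- A tiny "language N": straight-line programs acting on one Int.
-- (Hypothetical toy language used only for this illustration.)
data Instr = Add Int | Mul Int | Square

-- Standard (forward) interpreter for N, written in the host language.
intN :: [Instr] -> Int -> Int
intN prog x = foldl step x prog
  where
    step acc (Add k) = acc + k
    step acc (Mul k) = acc * k
    step acc Square  = acc * acc

-- Naive inverse computation for host-language functions: try every
-- candidate input and keep those producing the requested output.
invertOver :: Eq b => [a] -> (a -> b) -> b -> [a]
invertOver candidates f out = [x | x <- candidates, f x == out]

-- Inverse computation in N, obtained by inverting the interpreter
-- applied to a fixed N-program.
invertN :: [Instr] -> Int -> [Int]
invertN prog = invertOver [-100 .. 100] (intN prog)
```

For example, `invertN [Mul 2, Add 3] 11` returns `[4]` and `invertN [Square] 9` returns `[-3,3]`. No inverse interpreter for N was written; only the standard interpreter `intN` was supplied.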

According to this result, we should now be able to apply our algorithm to 
programs written in languages other than TSG. To put this theorem to a practi- 
cal trial, we implemented an interpreter for an imperative language, called MP, 
in TSG. MP [27] is a small imperative language with assignments (<==), condi- 
tionals (oIF) and loops (oWHILE). An MP-program operates over a store consist- 
ing of parameters and local variables. The semantics is conventional Pascal-style 
semantics. The MP-interpreter has 309 lines of pretty-printed source text, 30 
functions in TSG, and is the largest TSG-program we implemented. 

To compare the results with inverse computation in TSG, we rewrote the 
naive pattern matcher in MP. Figure 14 shows the results for the inversion of 
the MP-matcher. The answer for Task 1 is a finite representation of all possible 
substrings of string "ABC", Fig. 14(iii). The answer for Task 2 is a finite rep- 
resentation of all strings which are not substrings of "AAA", Fig. 14(iv). URA 
terminates after 36 sec (Task 1) and after 34 sec (Task 2). 
(i) ura [ match, [ ([Xe1, str"ABC"], []), 'SUCCESS ] ] ⇒
    [ ([Xe1 ↦ Xa4], []),                                   -- str""
      ([Xe1 ↦ (CONS 'A Xa10)], []),                        -- str"A"
      ([Xe1 ↦ (CONS 'A (CONS 'B Xa16))], []),              -- str"AB"
      ([Xe1 ↦ (CONS 'B Xa10)], []),                        -- str"B"
      ([Xe1 ↦ (CONS 'A (CONS 'B (CONS 'C Xa22)))], []),    -- str"ABC"
      ([Xe1 ↦ (CONS 'B (CONS 'C Xa16))], []),              -- str"BC"
      ([Xe1 ↦ (CONS 'C Xa10)], []) ]                       -- str"C"

(ii) ura [ match, [ ([Xe1, str"AAA"], []), 'FAILURE ] ] ⇒
    [ ([Xe1 ↦ (CONS 'A (CONS 'A (CONS 'A (CONS Xa25 Xe21))))], []),
      ([Xe1 ↦ (CONS Xa7 Xe3)], [Xa7 ≠ 'A]),
      ([Xe1 ↦ (CONS 'A (CONS 'A (CONS Xa19 Xe15)))], [Xa19 ≠ 'A]),
      ([Xe1 ↦ (CONS 'A (CONS Xa13 Xe9))], [Xa13 ≠ 'A]) ]

(iii) ura [ intMP, [ [matchMP, ([Xe1, str"ABC"], [])], 'SUCCESS ] ] ⇒
    [ ([Xe1 ↦ Xa4], []),                                   -- str""
      ([Xe1 ↦ (CONS 'A Xa10)], []),                        -- str"A"
      ([Xe1 ↦ (CONS 'A (CONS 'B Xa16))], []),              -- str"AB"
      ([Xe1 ↦ (CONS 'B Xa10)], []),                        -- str"B"
      ([Xe1 ↦ (CONS 'A (CONS 'B (CONS 'C Xa22)))], []),    -- str"ABC"
      ([Xe1 ↦ (CONS 'B (CONS 'C Xa16))], []),              -- str"BC"
      ([Xe1 ↦ (CONS 'C Xa10)], []) ]                       -- str"C"

(iv) ura [ intMP, [ [matchMP, ([Xe1, str"AAA"], [])], 'FAILURE ] ] ⇒
    [ ([Xe1 ↦ (CONS 'A (CONS 'A (CONS 'A (CONS Xe20 Xe21))))], []),
      ([Xe1 ↦ (CONS Xa7 Xe3)], [Xa7 ≠ 'A]),
      ([Xe1 ↦ (CONS 'A (CONS 'A (CONS Xa19 Xe15)))], [Xa19 ≠ 'A]),
      ([Xe1 ↦ (CONS 'A (CONS Xa13 Xe9))], [Xa13 ≠ 'A]) ]

Fig. 14. Inverse computation of pattern matcher (i, ii) and interpreter (iii, iv)



Inverse computation in MP (implemented by ura and intMP) produces results 
very similar® to inverse computation in TSG (implemented directly by ura) . This 
is noteworthy because inverse computation in MP is done through a standard 
interpreter for MP (and not by an inverse interpreter for MP). It demonstrates 
that inverse computation can be ported successfully, here, from a functional 
language to an imperative language. Inverse computation in MP takes longer 
than in TSG due to the additional interpretive overhead (about 70 times). 
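
As a quick check of the stated factor, using only the run times reported above (0.5 s for both TSG tasks, 36 s and 34 s for the two MP tasks):

$$\frac{36\ \mathrm{s}}{0.5\ \mathrm{s}} = 72, \qquad \frac{34\ \mathrm{s}}{0.5\ \mathrm{s}} = 68,$$

so the interpretive overhead is roughly a factor of 70 for both tasks.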

In earlier work [4], inverse computation was ported from TSG to a small
assembler-like programming language (called Norma). The only other experimental
work we are aware of that ported inverse computation inverts imperative
programs by treating their relational semantics as a logic program [26]. Our



The results differ slightly (Fig. 14: compare (ii) line 2 and (iv) line 2) due to small 
differences in the implementation of the source programs. 
experiment gives further practical evidence for the idea of porting inverse
computation from one language to another.

10 Related Work 

The first work on program inversion appears to be [22], suggesting a generate 
and test approach for Turing machines; this will correctly find an inverse when 
it exists, but is computationally infeasible. Several efforts have gone into imper- 
ative programs [16,7,17,6] but use non-automatic (sometimes heuristic) methods 
for deriving the inverse program. For example, the technique suggested in [7] 
provides for inverting programs symbolically, but requires that the programmer 
provide inductive assertions on conditionals and loop statements. 

Few papers have been devoted to inversion of functional programs [5,9,18,20,21,25];
they proceed in a similar manner, sometimes automatically. The work in functional
languages is usually on program inversion. An automatic system for synthesizing
recursive programs from first-order functional programs is InvX [20]. Function
inversion has received attention, at least conceptually, in program analysis
and program verification (e.g., [8,24]).

An early result [28] for inverse computation in a functional language was ob- 
tained in 1972 by a unification-based transformation technique called driving [29] 
which was used to perform subtraction by inverse computation of binary addi- 
tion. Later, universal resolving algorithms were implemented using methods from 
supercompilation [29] for first-order functional programs by combining them with 
a mechanical extraction of answers (cf. [1,25]).

We know of two techniques for inverse computation in functional languages: 
the universal resolving algorithm (see [1,4]) and walk grammars for inverse inter- 
pretation [30,23]. The universal resolving algorithm in this paper uses methods 
from supercompilation [29], in particular driving, and is based on perfect process 
trees [12]. Connections between inverse computation and logic programming are 
discussed in [1,4]; partial deduction and driving were formally related in [14]. An 
abstract framework for describing partial evaluation and supercompilation is [19] . 
A comprehensive bibliography on supercompilation can be found in [15]. 

To conclude, there exists only a small number of papers addressing inverse 
computation in the context of functional languages. With the exception of [26,4], 
we know of no paper addressing inverse computation in imperative languages. 

11 Conclusion and Future Work 

We presented an algorithm for inverse computation in a first-order functional 
language based on the notion of a perfect process tree, discussed the general 
organization and structure of inverse computation, stated the main correctness 
results, and illustrated our Haskell implementation with several examples. 

Among others, a motivation for our work was the thesis [13] that program
inversion is one of the three fundamental operations on programs (besides
program specialization and program composition). We believe that to achieve full
generality of program manipulation, ultimately all three operations have to be
mastered. So far, progress has been achieved mostly on program specialization.

For future work it is desirable, though not difficult, to extend our algorithm to 
user-defined constructor domains. This requires an extension of the set
representation in Sect. 2 and an extension of the source language (e.g., case-expressions).
In this paper we focused on a rigorous development of the principles and foun- 
dations of inverse computation and used S-expressions familiar from Lisp. 

In general, cutting all infeasible branches from a process tree cannot be 
guaranteed, in particular, when the underlying logic of the set representation 
is undecidable for certain logic formulas (or too time consuming to prove). For 
example, this is the case when using a tree developer based on generalized partial 
computation [10]. In this case, the solution of inverse computation may contain 
elements which represent empty sets of input (the correctness of the solution can 
be preserved). The set representation we used expresses structural properties of 
values that can always be resolved. Perfect splits are essential to guarantee the
correctness of the solution; cutting infeasible branches improves termination and
efficiency of the algorithm.

The question of a more efficient implementation is also left for future work. 
Our algorithm is fully implemented in Haskell and serves our experimental pur- 
poses quite well. In particular, Haskell’s lazy evaluation strategy allowed us to 
use a modular approach very close to the theoretical definition of the algorithm 
(where the development of perfect process trees and the inversion of the
tabulation are conveniently separated). The design of a more efficient algorithm would
require merging these steps. Compilation techniques and strategies developed
for logic programming may be beneficial for a more practical implementation. 

Finally, the relation to narrowing used in logic-functional programming and
term rewriting should be studied more formally (reference [14] relates driving 
and partial deduction). 
Abstract. This paper presents a modular and extensible style of language
specification based on metacomputations. This style uses two monads
to factor the static and dynamic parts of the specification, thereby
staging the specification and achieving strong binding-time separation.
Because metacomputations are defined in terms of monads, they can 
be constructed modularly and extensibly using monad transformers. A 
number of language constructs are specified: expressions, control-flow, 
imperative features, and block structure. Metacomputation-style speci- 
fication lends itself to semantics-directed compilation, which we demon- 
strate by creating a modular compiler for a block-structured, imperative 
while language. 

Keywords: Compilers, Partial Evaluation, Semantics-Based Compila- 
tion, Programming Language Semantics, Monads, Monad Transformers, 
Pass Separation. 



1 Introduction 

Metacomputations — computations that produce computations — arise naturally 
in the compilation of programs. Figure 1 illustrates this idea. The source lan- 
guage program s is taken as input by the compiler, which produces a target 
language program t. So, compiling s produces another computation — namely, 
the computation of t. Observe that there are two entirely distinct notions of 
computation here: the compilation of s and the execution of t. The reader will 
recognize this distinction as the classic separation of static from dynamic. Thus, 
staging is an instance of metacomputation. 

The main contributions of this paper are: (1) Compiler architecture 
based on metacomputations: Metacomputation-based compiler architecture 
yields substantially simpler language definitions than in [8], while still retaining 
its modular “mix and match” approach to compiler construction. Combining the 
metacomputation-based “reusable compiler building blocks” is also much simpler 
than combining those in [8] (as is proving their correctness). (2) A modular and 
extensible method of staging denotational specifications based on metacomputa- 
tions: A style of language specification based on metacomputation is proposed 
[Figure 1: diagram with the labels "s : Source", "compile", and "Target Interpreter"]

Fig. 1. Handwritten compiler as metacomputation



in which the static and dynamic parts of a language specification are factored 
into distinct monads[7,13,16,24]. (3) Direct-style specifications: instead of writ- 
ing all specifications in continuation-passing style, here we write in direct style, 
invoking the CPS monad transformer only when needed. This naturally simpli- 
fies many of the equations, and although less essential than (1) and (2), it also 
helps to make the approach more practical. 
Fig. 2. Modular Compilers: Existing compiler building blocks combine to make
new compiler
We believe this style of language specification may have many uses, but in
this paper we concentrate on one: modular compilation. Modular compilers are 
compilers built from building blocks that represent language features rather than 
compilation phases, as illustrated in Figure 2. Espinosa [7] and Liang, Hudak, 
& Jones [13] showed how to construct modular interpreters using the notion of 
monads [7,13,16,24] — or, more precisely, monad transformers. 

The current authors built on those ideas to produce modular compilers in [8] . 
However, there the notion of staging, though conceptually at the heart of the 
approach, was not explicit in the compiler building blocks we constructed. As in 
traditional monadic semantics, the monadic structure was useful in creating the 
domains, but those domains, once constructed, were “monolithic;” that is, they
gave no indication of which parts were for dynamic aspects of the computation 
and which for static aspects. The result was awkwardness in communicating 
between these aspects of the domain, which meant that “gluing together” com- 
piler blocks was sometimes delicate. However, metacomputation-based compiler 
architecture completely alleviates this awkwardness, so that combining compiler 
blocks is simply a matter of applying the appropriate monad transformers. 

Indeed, metacomputation is purposely avoided in [7,13,8]. A key aspect of that 
work is that monad transformers are used to create the single monad used to 
interpret or compile the language. The problem that inspired it was that monads 
don’t compose nicely. Given monads $M$ and $M'$, the composed monad $M \circ M'$,
corresponding to an $M$-computation that produces an $M'$-computation,
usually does not produce the “right” monolithic domain. However, there
may exist monad transformers $T_M$ and $T_{M'}$ such that $T_M\, Id = M$ and
$T_{M'}\, Id = M'$, where $(T_M \circ T_{M'})\, Id$ does give the “right” domain. The difference between
composing monads and composing monad transformers is what makes these
approaches work: monad transformers are a way to avoid metacomputation.

In this paper, we show that, for some purposes, metacomputation may be 
exactly what one wants: Defining a compiler block via the metacomputation of 
two monads gives an effective representation of staging. We are not advocating 
abandoning monad transformers: the two monads can be constructed using them, 
with the attendant advantages of that approach. We are simply saying that 
having two monads — what might be called the static and dynamic monads — 
and composing them seems to give the “right” domain for modular compilation. 
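
The idea of a static computation that produces a dynamic computation can be sketched in a few lines. The code below is an illustration under simplifying assumptions and does not reproduce the definitions of this paper: the static monad is a hand-rolled state monad over a label counter, the dynamic computation is just a function from a run-time store, and the names `Static`, `Dynamic`, `newLabel`, `Exp`, and `compile` are hypothetical.

```haskell
-- Static monad: threads a label counter through compilation.
newtype Static a = Static { runStatic :: Int -> (a, Int) }

instance Functor Static where
  fmap f (Static g) = Static $ \n -> let (a, n') = g n in (f a, n')

instance Applicative Static where
  pure a = Static $ \n -> (a, n)
  Static f <*> Static g = Static $ \n ->
    let (h, n1) = f n
        (a, n2) = g n1
    in (h a, n2)

instance Monad Static where
  Static g >>= k = Static $ \n -> let (a, n') = g n in runStatic (k a) n'

-- Static effect: allocate a fresh label at compile time.
newLabel :: Static Int
newLabel = Static $ \n -> (n, n + 1)

-- Dynamic computation: reads a run-time store when it is finally executed.
type Store = [(String, Int)]
type Dynamic a = Store -> a

data Exp = Lit Int | VarE String | Plus Exp Exp

-- A metacomputation: running the static stage yields a dynamic computation.
-- One label is allocated per Plus node purely to exhibit a static effect.
compile :: Exp -> Static (Dynamic Int)
compile (Lit n)    = pure (const n)
compile (VarE x)   = pure (\st -> maybe 0 id (lookup x st))
compile (Plus a b) = do
  _label <- newLabel
  da <- compile a
  db <- compile b
  pure (\st -> da st + db st)
```

Running `runStatic (compile e) 0` performs all label allocation before any store is available; the returned function can then be applied to different stores without repeating the static work. Unbound variables default to 0 here only to keep the sketch short.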

The next section explains the advantages for modular compilation of meta- 
computation-based language specification over the monolithic style. Section 3 
reviews the most relevant related work. In Section 4, we review the theory of 
monads and monad transformers and their use in language specification. Sec- 
tion 5 presents a case study in metacomputation-style language specification; its 
subsections present metacomputation-style specifications for expressions, control 
flow, block structure, and imperative features, respectively. Section 6 shows how 
to combine these compiler building blocks into a compiler for the combined lan- 
guage, and presents a compiler and an example compilation. Section 7 discusses 
the impact of metacomputation-based specification on compiler correctness. Fi- 
nally, Section 8 summarizes this work and outlines future research. 



2 Why Metacomputations? 

In this section, we will describe at a high level why two monads are better 
than one for modular compilation. Using metacomputations instead of a single 
monolithic monad simplifies the use of the “code store” (defined below) in the 
specification of reusable compiler building blocks. 

In [8], we borrowed a technique from denotational semantics [23] for modeling 
jumps, namely storing command continuations in a “code store” and denoting 
“jump L” as “execute the continuation at label L in the code store.” Viewing 
command continuations as machine code is a common technique in semantics-
directed compilation [25,22]. Because our language specifications were in monadic
style, it was a simple matter to add label generator and code store states to the 
underlying monad. Indeed, the primary use for monads in functional program- 
ming seems to be that of adding state- like features to purely functional languages 
and programs [24,20], and the fact that we structured our monads in [8] with 
monad transformers made adding the new states simple. 

The use of a code store is integral to the modular compilation technique de- 
scribed in [8]. We use it to compile control-flow and procedures, and the presence 
of the code store in our language specifications allowed us to make substantial 
improvements over Reynolds[22] (e.g., avoiding infinite programs through jumps 
and labels). Yet the mixing of static with dynamic data into one “monolithic” 
monad causes a number of problems with using the code store. Consider the 
program “if b then (if b' then c)”. Compiling the outer “if” with initial
continuation halt and label 0 will result in the continuation “[[if b' then c]]; halt”
being stored at label 0 and the label counter being incremented. The problem
here is that trying to compile this continuation via partial evaluation will fail.
Why? Because having been stored rather than executed, it will not have access
to the next label 1. Instead, the partial evaluator will try to increment a
(dynamic) variable rather than an actual (static) integer, and this will cause an
error (e.g., a partial evaluator can evaluate “1+1” but not “x+1”). In [8], the
monolithic style specifications forced all static data to be explicitly passed to 
stored command continuations, although this was at the expense of modularity. 
In fact to compile if-then-else, the snapback operator had to be used. These 
complications also make reasoning about compilers constructed in [8] difficult. 
We shall demonstrate in Section 5 that using metacomputations results in vastly 
simpler compiler specifications than in [8] and that this naturally makes them 
easier to reason about. 

3 Related Work 

Espinosa [7] and Hudak, Liang, and Jones [III] use monad transformers to create 
modular, extensible interpreters. Liang [12,14] addresses the question of whether 
compilers can be developed similarly, but since he does not compile to machine 
language, many of the issues we confront — especially staging — do not arise. 

A syntactic form of metacomputation can be found in the two-level λ-calculus
of Nielson [19]. Two-level λ-calculus contains two distinct λ-calculi,
representing the static and dynamic levels. Expressions of mixed level, then,
have strongly separated binding times by definition. Nielson [18] applies
two-level λ-calculus to code generation for a typed λ-calculus, and Nielson [19]
presents an algorithm for static analysis of a typed λ-calculus which converts
one-level specifications into two-level specifications. Mogensen [15]
generalizes this algorithm to handle variables of mixed binding times. The
present work offers a semantic alternative to the two-level λ-calculus. We
formalize distinct levels (in the sense of Nielson [19]) as distinct monads,
and the resulting specifications have all of the traditional advantages of
monadic specifications (reusability, extensibility, and modularity). While our
binding time analysis is not automatic as in [19,15], we consider a far wider
range of programming language features than they do.

Danvy and Vestergaard [5] show how to produce code that “looks like” ma- 
chine language, by expressing the source language semantics in terms of machine 
language-like combinators (e.g., “popblock”, “push”). When the interpreter is 
closed over these combinators, partial evaluation of this closed term with respect 
to a program produces a completely dynamic term, composed of a sequence of 
combinators, looking very much like machine language. This approach is key to 
making the monadic structure useful for compilation. 

Reynolds’ [22] demonstration of how to produce efficient code in a compiler 
derived from the functor category semantics of an Algol-like language was an 
original inspiration for this study. Our approach to compilation improves on 
Reynolds’s in two ways: it is monad-structured — that is, built from interchange- 
able parts — and it includes jumps and labels where Reynolds simply allowed 
code duplication and infinite programs. 

4 Monads and Monad Transformers 

In this section, we review the theory of monads [16,24] and monad transform- 
ers [7,13]. Readers familiar with these topics may skip the section. 

A monad is a type constructor M together with a pair of functions (obeying
certain algebraic laws that we omit here):

\[
\begin{array}{l}
\star_M : M\tau \to (\tau \to M\tau') \to M\tau' \\
\mathrm{unit}_M : \tau \to M\tau
\end{array}
\]

A value of type $M\tau$ is called a $\tau$-computation, the idea being that it
yields a value of type $\tau$ while also performing some other computation. The
$\star_M$ operation generalizes function application in that it determines how
the computations associated with monadic values are combined. $\mathrm{unit}_M$
defines how a $\tau$ value can be regarded as a $\tau$-computation; it is
usually a trivial computation.

To see how monads are used, suppose we wish to define a language of integer
expressions containing constants and addition. The standard definition might
be:

\[ [\![e_1 + e_2]\!] = [\![e_1]\!] + [\![e_2]\!] \]

where $[\![-]\!] : \mathit{Expression} \to \mathit{int}$. However, this
definition is inflexible; if expressions needed to look at a store, or could
generate errors, or had some other feature not planned on, the equation would
need to be changed.

Monads can provide this needed flexibility. To start, we rephrase the
definition of $[\![-]\!]$ in monadic form (using infix bind $\star$, as is
traditional) so that $[\![-]\!]$ has type
$\mathit{Expression} \to M\,\mathit{int}$:

\[ [\![e_1 + e_2]\!] = [\![e_1]\!] \star (\lambda x.\ [\![e_2]\!] \star (\lambda y.\ \mathrm{add}(x, y))) \]

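We must define an operation add of type
$\mathit{int} \times \mathit{int} \to M\,\mathit{int}$.

Identity Monad Id:

\[
\begin{array}{l}
\mathrm{Id}\ \tau = \tau \\
\mathrm{unit}_{\mathrm{Id}}\ x = x \\
x \star_{\mathrm{Id}} f = f\ x
\end{array}
\]

CPS Monad Transformer $\mathcal{T}_{\mathrm{CPS}}$:

\[
\begin{array}{l}
M'\tau = \mathcal{T}_{\mathrm{CPS}}\ \mathit{ans}\ M\ \tau = (\tau \to M\,\mathit{ans}) \to M\,\mathit{ans} \\
\mathrm{unit}_{M'}\ x = \lambda\kappa.\ \kappa\ x \\
x \star_{M'} f = \lambda\kappa.\ x\,(\lambda a.\ f\ a\ \kappa) \\
\mathrm{lift}_{M \to M'}\ x = x \star_M \\
\mathrm{callcc} : ((a \to M' b) \to M' a) \to M' a \\
\mathrm{callcc}\ f = \lambda\kappa.\ f\,(\lambda a.\lambda\_.\ \kappa\ a)\ \kappa
\end{array}
\]

Environment Monad Transformer $\mathcal{T}_{\mathrm{Env}}$:

\[
\begin{array}{l}
M'\tau = \mathcal{T}_{\mathrm{Env}}\ \mathit{Env}\ M\ \tau = \mathit{Env} \to M\tau \\
\mathrm{unit}_{M'}\ x = \lambda\rho : \mathit{Env}.\ \mathrm{unit}_M\ x \\
x \star_{M'} f = \lambda\rho : \mathit{Env}.\ (x\ \rho) \star_M (\lambda a.\ f\ a\ \rho) \\
\mathrm{lift}_{M \to M'}\ x = \lambda\rho : \mathit{Env}.\ x \\
\mathrm{rdEnv} : M'\,\mathit{Env} = \lambda\rho : \mathit{Env}.\ \mathrm{unit}_M\ \rho \\
\mathrm{inEnv}(\rho : \mathit{Env},\ x : M'\tau) = \lambda\_.\ (x\ \rho) : M'\tau
\end{array}
\]

State Monad Transformer $\mathcal{T}_{\mathrm{St}}$:

\[
\begin{array}{l}
M'\tau = \mathcal{T}_{\mathrm{St}}\ \mathit{store}\ M\ \tau = \mathit{store} \to M(\tau \times \mathit{store}) \\
\mathrm{unit}_{M'}\ x = \lambda\sigma : \mathit{store}.\ \mathrm{unit}_M\,(x, \sigma) \\
x \star_{M'} f = \lambda\sigma_0 : \mathit{store}.\ (x\ \sigma_0) \star_M (\lambda(a, \sigma_1).\ f\ a\ \sigma_1) \\
\mathrm{lift}_{M \to M'}\ x = \lambda\sigma.\ x \star_M \lambda y.\ \mathrm{unit}_M\,(y, \sigma) \\
\mathrm{update}(\Delta : \mathit{store} \to \mathit{store}) = \lambda\sigma.\ \mathrm{unit}_M\,(\bullet, \Delta\,\sigma) \\
\mathrm{getStore} = \lambda\sigma.\ \mathrm{unit}_M\,(\sigma, \sigma)
\end{array}
\]

Fig. 3. The Identity Monad, and Environment, CPS, and State Monad Transformers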



The beauty of the monadic form is that the meaning of $[\![-]\!]$ can be
reinterpreted in a variety of monads. Monadic semantics separate the description of a
language from its denotation. In this sense, it is similar to action semantics [17]
and high-level semantics [11].

The simplest monad is the identity monad, shown in Figure 3. Given the
identity monad, we can define add as ordinary addition. $[\![-]\!]$ would have
type $\mathit{Expression} \to \mathit{int}$.
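The reinterpretation idea can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the names `Const`, `Add`, `Monad`, and `evaluate` are hypothetical, and the clause for constants is an assumption, since the text only gives the equation for addition. The evaluator is written once against `unit`, `bind`, and `add`, and is then run in the identity monad.

```python
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass
class Const:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Add]


@dataclass
class Monad:
    unit: Callable[[Any], Any]                        # unit : t -> M t
    bind: Callable[[Any, Callable[[Any], Any]], Any]  # bind : M t -> (t -> M t') -> M t'
    add: Callable[[int, int], Any]                    # add  : int x int -> M int


def evaluate(m: Monad, e: Expr) -> Any:
    """Monadic [[e]]; the equations never look inside the monad m."""
    if isinstance(e, Const):
        # Assumed clause for constants: [[n]] = unit n
        return m.unit(e.value)
    # [[e1 + e2]] = [[e1]] * (\x. [[e2]] * (\y. add(x, y)))
    return m.bind(evaluate(m, e.left),
                  lambda x: m.bind(evaluate(m, e.right),
                                   lambda y: m.add(x, y)))


# Identity monad: unit x = x, x * f = f x, and add is ordinary addition.
identity = Monad(unit=lambda x: x,
                 bind=lambda x, f: f(x),
                 add=lambda x, y: x + y)

result = evaluate(identity, Add(Const(1), Add(Const(2), Const(3))))
```

Swapping `identity` for another `Monad` value changes the meaning of every expression without touching `evaluate`.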

Perhaps the best known monad is the state monad, which represents the
notion of a computation as something that modifies a store:

\[
\begin{array}{l}
M_{St}\,\tau = \mathit{Sto} \to \tau \times \mathit{Sto} \\
x \star f = \lambda\sigma.\ \mathbf{let}\ (x', \sigma') = x\,\sigma\ \mathbf{in}\ f\,x'\,\sigma' \\
\mathrm{unit}\ v = \lambda\sigma.\ (v, \sigma) \\
\mathrm{add}(x, y) = \lambda\sigma.\ (x + y, \sigma)
\end{array}
\]

The $\star$ operation handles the bookkeeping of “threading” the store through
the computation. Now, $[\![-]\!]$ has type
$\mathit{Expression} \to \mathit{Sto} \to \mathit{int} \times \mathit{Sto}$.
This might be an appropriate meaning for addition in an imperative language. To
define operations that actually have side effects, we can define the functions:

\[
\begin{array}{l}
\mathrm{updateSto} : (\mathit{Sto} \to \mathit{Sto}) \to M_{St}\,\mathit{void} \\
\qquad : f \mapsto \lambda\sigma.\ (\bullet, f\,\sigma) \\
\mathrm{getSto} : M_{St}\,\mathit{Sto} \\
\qquad : \lambda\sigma.\ (\sigma, \sigma)
\end{array}
\]

updateSto applies a function to the store and returns a useless value (we assume
a degenerate type void having a single element, which we denote $\bullet$).
getSto returns the store.
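As a rough illustration of these equations, the following Python sketch encodes a state-monad computation as a function from a store to a (value, store) pair. The store is taken to be a plain integer counter, and `counted_add` is a hypothetical operation that uses `update_sto` to count additions; both are assumptions made for the example, not part of the text.

```python
def unit(v):
    # unit v = \s. (v, s)
    return lambda sto: (v, sto)


def bind(x, f):
    # x * f = \s. let (x', s') = x s in f x' s'
    def run(sto):
        value, sto1 = x(sto)
        return f(value)(sto1)
    return run


def update_sto(delta):
    # updateSto f = \s. (bullet, f s); None stands in for the bullet value
    return lambda sto: (None, delta(sto))


def get_sto(sto):
    # getSto = \s. (s, s)
    return (sto, sto)


def counted_add(x, y):
    # Hypothetical side-effecting add: bump the counter, then return x + y.
    return bind(update_sto(lambda n: n + 1), lambda _: unit(x + y))


# (1 + 2) + 3, then read the counter to see how many additions ran.
prog = bind(counted_add(1, 2), lambda a:
       bind(counted_add(a, 3), lambda b:
       bind(get_sto, lambda n:
       unit((b, n)))))

result = prog(0)  # start with the counter at 0
```

The program itself never mentions the store explicitly; `bind` does all of the threading.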

Now, suppose a computation can cause side effects on two separate stores.
One could define a new “double-state” monad $M_{2St}$:

\[ M_{2St}\,\tau = \mathit{Sto} \times \mathit{Sto} \to \tau \times \mathit{Sto} \times \mathit{Sto} \]

that would thread the two states through the computation, with separate
updateSto and getSto operations for each copy of Sto. One might expect to get
$M_{2St}\,\tau$ by applying the ordinary state monad twice. Unfortunately,
$M_{St}(M_{St}\,\tau)$ and $M_{2St}\,\tau$ are very different types. This points
to a difficulty with monads: they do not compose in this simple manner.

The key contribution of the work [7,13] on monad transformers is to solve
this composition problem. When applied to a monad M, a monad transformer
$\mathcal{T}$ creates a new monad M'. For example, the state monad transformer,
$\mathcal{T}_{\mathrm{St}}\ \mathit{store}$, is shown in Figure 3. (Here, the
store is a type argument, which can be replaced by any value which is to be
“threaded” through the computation.) Note that
$\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id}$ is identical to the state
monad, but here we get a useful notion of composition:
$\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id})$
is equivalent to the two-state monad $M_{2St}\,\tau$. The state monad
transformer also provides updateSto and getSto operations appropriate to the
newly-created monad. When composing
$\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}$ with itself, as above, the operations
on the “inner” state need to be lifted through the outer state monad; this is
the main technical issue in [7,13].
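The composition and lifting described here can be sketched in Python by treating a monad as a record of operations and the state transformer as a function from one such record to another. This is a minimal sketch following the equations of Figure 3; the record layout and the names `state_t`, `inner`, and `outer` are assumptions made for the example.

```python
from types import SimpleNamespace

# Identity monad: unit x = x, x * f = f x.
identity = SimpleNamespace(unit=lambda x: x, bind=lambda x, f: f(x))


def state_t(m):
    """State transformer: computations are functions store -> m (value, store)."""
    return SimpleNamespace(
        unit=lambda x: lambda s: m.unit((x, s)),
        bind=lambda x, f: lambda s0: m.bind(x(s0), lambda p: f(p[0])(p[1])),
        # lift runs an m-computation without touching this layer's store
        lift=lambda x: lambda s: m.bind(x, lambda y: m.unit((y, s))),
        update=lambda delta: lambda s: m.unit((None, delta(s))),
        get=lambda s: m.unit((s, s)),
    )


inner = state_t(identity)  # the ordinary state monad
outer = state_t(inner)     # a second store layered on top of the first

# Increment the outer store, double the inner store (via lift), read both.
prog = outer.bind(outer.update(lambda s: s + 1), lambda _:
       outer.bind(outer.lift(inner.update(lambda s: s * 2)), lambda _:
       outer.bind(outer.get, lambda o:
       outer.bind(outer.lift(inner.get), lambda i:
       outer.unit((o, i))))))

# Outer store starts at 10, inner store at 5.
result = prog(10)(5)
```

The result has the shape `((value, outer_store), inner_store)`, which is the curried form of the two-state monad.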

In our work in [8], we found it convenient to factor the state monad into 
two parts: the state proper and the address allocator. This was really a “staging 
transformation,” with the state monad representing dynamic computation and 
the address allocator static computation, but, as mentioned earlier, it led to 
significant complications. In the current paper, we are separating these parts 
more completely, by viewing compilation as metacomputation. 



4.1 A Semantics for Metacomputation 



We can formalize this notion of metacomputation using monads [7,13,16,24] and
use the resulting framework as a basis for staging computations. Given a monad
M, the computations of type a are the elements of the type M a. So, given two
monads M and N, the metacomputations of type a are the elements of the type
M(N a), because the M-computation produces as its value an N-computation. This
definition is not superfluous; as we have noted, M ∘ N is not generally a
monad, so metacomputations are generally a different notion altogether from
computations.

5 A Case Study in Metacomputation-Based Compiler Architecture: Modular Compilation for the While Language

In this section, we present several compiler building blocks. In Section 6,
they will be combined to create a compiler. For the first two of these blocks,
we also give monolithic versions, drawn from [8], to illustrate why
metacomputation is helpful. Of particular importance to the present work,
Section 5.2 presents the reusable compiler building block for control flow,
which demonstrates how metacomputation-based compiler architecture solves the
difficulties with the monolithic approach we outlined in Section 2.



5.1 Integer Expressions Compiler Building Block 



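Standard:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathrm{Id} \\
[\![-t]\!] : \mathrm{Dynam}(\mathit{int}) = [\![t]\!] \star_D \lambda i.\ \mathrm{unit}_D(-i)
\end{array}
\]

Implementation-oriented/Monolithic:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Addr}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id}) \\
\mathit{Addr} = \mathit{int}, \quad \mathit{Sto} = \mathit{Addr} \to \mathit{int} \\
\mathrm{Thread}(i : \mathit{int},\ a : \mathit{Addr}) = \mathrm{updateSto}[a \mapsto i] \star_D \lambda\_.\ \mathrm{rdloc}(a) \\
\mathrm{rdloc}(a) = \mathrm{getSto} \star_D \lambda\sigma.\ \mathrm{unit}_D(\sigma\,a) \\
\mathit{Mono}[\![-t]\!] : \mathrm{Dynam}(\mathit{int}) = \\
\quad \mathit{Mono}[\![t]\!] \star_D \lambda i. \\
\quad \mathrm{rdAddr} \star_D \lambda a. \\
\quad \mathrm{inAddr}\ (a + 1) \\
\quad\quad (\mathrm{Thread}(i, a) \star_D \lambda v.\ \mathrm{unit}_D(-v))
\end{array}
\]

Metacomputation:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id} \\
\mathrm{Static} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Addr}\ \mathrm{Id} \\
\mathcal{C}[\![-t]\!] : \mathrm{Static}(\mathrm{Dynam}(\mathit{int})) = \\
\quad \mathrm{rdAddr} \star_S \lambda a. \\
\quad \mathrm{inAddr}\ (a + 1) \\
\quad\quad (\mathcal{C}[\![t]\!] \star_S \lambda\phi_t : \mathrm{Dynam}(\mathit{int}). \\
\quad\quad\quad \mathrm{unit}_S\,(\phi_t \star_D \lambda i.\ \mathrm{Thread}(i, a) \star_D \lambda v.\ \mathrm{unit}_D(-v)))
\end{array}
\]

Fig. 4. Negation, 3 ways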



Consider the standard monadic-style specification of negation [7,13,24]
displayed in Figure 4. To use this as a compiler specification for negation, we
need to make a more implementation-oriented version, which might be defined
informally as:

\[ [\![-t]\!] = [\![t]\!] \star_D \lambda i.\ \text{“Store } i \text{ at } a \text{ and return contents of } a\text{”} \star_D \lambda v.\ \mathrm{unit}_D(-v) \]

Let us assume that this is written in terms of a monad Dynam with bind and
unit operations $\star_D$ and $\mathrm{unit}_D$. Observe that this
implementation-oriented definition calculates the same value as the standard
definition, but it stores the intermediate value i as well. But where do
addresses and storage come from? In [8], we added them to the Dynam monad using
monad transformers [7,13] as in the “Implementation-oriented” specification in
Figure 4. In that definition, rdAddr reads the current top of stack address a,
inAddr increments the top of stack, and Thread stores i at a. The monad (Dynam)
is used to construct the domain containing both static and dynamic data.



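\[
\begin{array}{l}
\mathcal{C}[\![e_1 + e_2]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{int}) = \\
\quad \mathrm{rdAddr} \star_S \lambda a. \\
\quad \mathcal{C}[\![e_1]\!] \star_S \lambda\phi_{e_1}. \\
\quad \mathrm{inAddr}\ (a + 2)\ \mathcal{C}[\![e_2]\!] \star_S \lambda\phi_{e_2}. \\
\quad \mathrm{unit}_S\,( \\
\quad\quad \phi_{e_1} \star_D \lambda i : \mathit{int}. \\
\quad\quad \phi_{e_2} \star_D \lambda j : \mathit{int}. \\
\quad\quad \mathrm{Thread}(i, a) \star_D \lambda v_1. \\
\quad\quad \mathrm{Thread}(j, (a + 1)) \star_D \lambda v_2. \\
\quad\quad \mathrm{unit}_D(v_1 + v_2))
\end{array}
\]

Fig. 5. Specification for Addition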



In the “metacomputation”-style specification, we use two monads: Static, to
encapsulate the static data, and Dynam, to encapsulate the dynamic data.
The meaning of the phrase is a metacomputation: the Static monad produces
a computation of the Dynam monad. Clear separation of binding times is thus
achieved. (In our examples, we have set the dynamic parts of the computation
in a box for emphasis.)

Figure 5 displays the specification for addition, which is similar to negation. 
Multiplication and subtraction are defined analogously. 
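The metacomputation-style specifications of negation and addition can be mimicked in Python by nesting two kinds of functions. In this sketch, which rests on several assumptions, a Static computation is a function from the next free address to a result, a Dynam computation is a function from a store (a dict from addresses to integers) to a (value, store) pair, and `c_const` is an assumed clause for integer constants that the figures do not show.

```python
# Dynam: state monad over a store (dict from addresses to ints).
def unit_d(v):
    return lambda sto: (v, sto)


def bind_d(x, f):
    def run(sto):
        value, sto1 = x(sto)
        return f(value)(sto1)
    return run


def thread(i, a):
    """Thread(i, a): store i at address a, then return the contents of a."""
    def run(sto):
        sto1 = {**sto, a: i}
        return (sto1[a], sto1)
    return run


# Static: environment (reader) monad over the next free address.
def unit_s(x):
    return lambda a: x


def bind_s(x, f):
    return lambda a: f(x(a))(a)


rd_addr = lambda a: a


def in_addr(a, x):
    return lambda _: x(a)


def c_const(n):  # assumed clause for constants
    return unit_s(unit_d(n))


def c_neg(t):
    return bind_s(rd_addr, lambda a:
           in_addr(a + 1,
               bind_s(t, lambda phi_t:
               unit_s(bind_d(phi_t, lambda i:
                      bind_d(thread(i, a), lambda v:
                      unit_d(-v)))))))


def c_add(e1, e2):
    return bind_s(rd_addr, lambda a:
           bind_s(e1, lambda phi1:
           bind_s(in_addr(a + 2, e2), lambda phi2:
           unit_s(bind_d(phi1, lambda i:
                  bind_d(phi2, lambda j:
                  bind_d(thread(i, a), lambda v1:
                  bind_d(thread(j, a + 1), lambda v2:
                  unit_d(v1 + v2)))))))))


# Static phase: all addresses are fixed here, before any store exists.
phi = c_neg(c_add(c_const(2), c_const(3)))(0)
# Dynamic phase: run the residual computation on an empty store.
value, store = phi({})
```

After the static phase, `phi` mentions only concrete addresses, which is the binding-time separation the text describes.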



5.2 Control-Flow Compiler Building Block 



We now present an example where separating binding times in specifications
with metacomputations has a very significant advantage over the monolithic
approach. Consider the three definitions of the conditional if-then statement
in Figure 6. The first is a dual continuation “control-flow” semantics, found
commonly in compilers [2]. If B is true, then the first continuation,
$[\![c]\!] \star_D \kappa$, is executed, otherwise c is skipped and just
$\kappa$ is executed. A more implementation-oriented (informal) specification
might be:

\[
\begin{array}{l}
[\![\mathrm{if}\ b\ \mathrm{then}\ c]\!] = \\
\quad [\![b]\!] \star_D \lambda B. \\
\quad \text{“get two new labels } L_c, L_\kappa\text{”} \star_D \lambda\langle L_c, L_\kappa\rangle. \\
\quad \mathrm{callcc}\ (\lambda\kappa. \\
\quad\quad \text{“store } \kappa \text{ at } L_\kappa\text{, then } ([\![c]\!] \star_D (\text{“jump to } L_\kappa\text{”})) \text{ at } L_c\text{”} \star_D \lambda\_. \\
\quad\quad B(\text{“jump to } L_c\text{”},\ \text{“jump to } L_\kappa\text{”}))
\end{array}
\]

To formalize this specification, we use a technique from denotational
semantics for modeling jumps. We introduce a continuation store, Code, and a
label state, Label. A jump to label L simply invokes the continuation stored at
L. The second definition in Figure 6 presents an implementation-oriented
specification of if-then in monolithic style (that is, where Code and Label are
both added to Dynam). Again, this represents our approach in [8].



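Control-Flow:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ \mathrm{Id} \\
\mathit{Bool} = \forall\alpha.\ \alpha \times \alpha \to \alpha \\
[\![\mathrm{if}\ b\ \mathrm{then}\ c]\!] : \mathrm{Dynam}(\mathit{void}) = \\
\quad [\![b]\!] \star_D \lambda B : \mathit{Bool}. \\
\quad \mathrm{callcc}\ (\lambda\kappa.\ B([\![c]\!] \star_D \kappa,\ \kappa))
\end{array}
\]

Implementation-oriented/Monolithic:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Label}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Code}\ \mathrm{Id})) \\
\mathit{Label} = \mathit{int}, \quad \mathit{Code} = \mathit{void} \to \mathrm{Dynam}\ \mathit{void} \\
\mathrm{jump}\ L = \mathrm{getCode} \star_D (\lambda\Pi : \mathit{Code}.\ \Pi\,L) \\
\mathrm{newlabel} : \mathrm{Dynam}(\mathit{Label}) = \\
\quad \mathrm{getLabel} \star_D \lambda L : \mathit{Label}. \\
\quad \mathrm{updateLabel}[L \mapsto L + 1] \star_D \lambda\_. \\
\quad \mathrm{unit}_D(L) \\
\mathit{Mono}[\![\mathrm{if}\ b\ \mathrm{then}\ c]\!] : \mathrm{Dynam}(\mathit{void}) = \\
\quad \mathit{Mono}[\![b]\!] \star_D \lambda B : \mathit{Bool}. \\
\quad \mathrm{newlabel} \star_D \lambda L_\kappa. \\
\quad \mathrm{newlabel} \star_D \lambda L_c. \\
\quad \mathrm{callcc}\ (\lambda\kappa. \\
\quad\quad \mathrm{newSegment}(L_\kappa,\ \kappa) \star_D \lambda\_. \\
\quad\quad \mathrm{newSegment}(L_c,\ \mathit{Mono}[\![c]\!] \star_D (\mathrm{jump}\ L_\kappa)) \star_D \lambda\_. \\
\quad\quad B(\mathrm{jump}\ L_c,\ \mathrm{jump}\ L_\kappa))
\end{array}
\]

Metacomputation:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Code}\ \mathrm{Id}), \quad \mathrm{Static} = \mathcal{T}_{\mathrm{St}}\ \mathit{Label}\ \mathrm{Id} \\
\mathrm{IfThen} : \mathrm{Dynam}(\mathit{Bool}) \times \mathrm{Dynam}(\mathit{void}) \times \mathit{Label} \times \mathit{Label} \to \mathrm{Dynam}(\mathit{void}) \\
\mathrm{IfThen}(\phi_B, \phi_c, L_c, L_\kappa) = \\
\quad \phi_B \star_D \lambda B : \mathit{Bool}. \\
\quad \mathrm{callcc}\ (\lambda\kappa. \\
\quad\quad \mathrm{updateCode}[L_\kappa \mapsto \kappa] \star_D \lambda\_. \\
\quad\quad \mathrm{updateCode}[L_c \mapsto \phi_c \star_D (\mathrm{jump}\ L_\kappa)] \star_D \lambda\_. \\
\quad\quad B(\mathrm{jump}\ L_c,\ \mathrm{jump}\ L_\kappa)) \\
\mathcal{C}[\![\mathrm{if}\ b\ \mathrm{then}\ c]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{void}) = \\
\quad \mathcal{C}[\![b]\!] \star_S \lambda\phi_B. \\
\quad \mathcal{C}[\![c]\!] \star_S \lambda\phi_c. \\
\quad \mathrm{newlabel} \star_S \lambda L_c. \\
\quad \mathrm{newlabel} \star_S \lambda L_\kappa. \\
\quad \mathrm{unit}_S\,(\mathrm{IfThen}(\phi_B, \phi_c, L_c, L_\kappa))
\end{array}
\]

Fig. 6. if-then: 3 ways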



One very subtle problem remains: what is “newSegment”? One’s first impulse
is to define it as a simple update to the Code store (i.e.,
$\mathrm{updateCode}[L_\kappa \mapsto \kappa]$), but here is where the
monolithic approach greatly complicates matters. Because the monolithic
specification mixes static and dynamic computation, the continuation $\kappa$
may contain both kinds of computation. But because it is stored and not
executed, $\kappa$ will not have access to the current label count and any other
static data necessary for proper staging. Therefore, newSegment must explicitly
pass the current label count and any other static intermediate data structure to
the continuation it stores^.



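\[
\begin{array}{l}
\mathcal{C}[\![e_1 < e_2]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{Bool}) = \\
\quad \mathrm{rdAddr} \star_S \lambda a. \\
\quad \mathcal{C}[\![e_1]\!] \star_S \lambda\phi_{e_1}. \\
\quad \mathrm{inAddr}\ (a + 2)\ \mathcal{C}[\![e_2]\!] \star_S \lambda\phi_{e_2}. \\
\quad \mathrm{unit}_S\,( \\
\quad\quad \phi_{e_1} \star_D \lambda i : \mathit{int}. \\
\quad\quad \phi_{e_2} \star_D \lambda j : \mathit{int}. \\
\quad\quad \mathrm{Thread}(i, a) \star_D \lambda v_1. \\
\quad\quad \mathrm{Thread}(j, (a + 1)) \star_D \lambda v_2. \\
\quad\quad \mathrm{unit}_D(\lambda\langle\kappa_T, \kappa_F\rangle.\ (v_1 < v_2 \to \kappa_T,\ \kappa_F)))
\end{array}
\]

\[
\begin{array}{l}
\mathcal{C}[\![\mathrm{while}\ b\ \mathrm{do}\ c]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{void}) = \\
\quad \mathcal{C}[\![b]\!] \star_S \lambda\phi_B. \\
\quad \mathcal{C}[\![c]\!] \star_S \lambda\phi_c. \\
\quad \mathrm{newlabel} \star_S \lambda L_{test}. \\
\quad \mathrm{newlabel} \star_S \lambda L_c. \\
\quad \mathrm{newlabel} \star_S \lambda L_\kappa. \\
\quad \mathrm{unit}_S\,( \\
\quad\quad \mathrm{callcc}\ \lambda\kappa. \\
\quad\quad \phi_B \star_D \lambda B : \mathit{Bool}. \\
\quad\quad \mathrm{updateCode}[L_\kappa \mapsto \kappa] \star_D \lambda\_. \\
\quad\quad \mathrm{updateCode}[L_c \mapsto \phi_c \star_D (\mathrm{jump}\ L_{test})] \star_D \lambda\_. \\
\quad\quad \mathrm{updateCode}[L_{test} \mapsto \phi_B \star_D \lambda B.\ (B(\mathrm{jump}\ L_c,\ \mathrm{jump}\ L_\kappa))] \star_D \lambda\_. \\
\quad\quad \mathrm{jump}\ L_{test})
\end{array}
\]

Fig. 7. Specification for < and while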



The last specification in Figure 6 defines if-then as a metacomputation and
is much simpler than the monolithic-style specification. Observe that Dynam
does not include the Label store, and so the continuation $\kappa$ now includes
only dynamic computations. Therefore, there is no need to pass in the label
count to $\kappa$, and so $\kappa$ may simply be stored in Code. This is a
central advantage of the metacomputation-based specification: because of the
separation of static and dynamic data into two monads, the complications
outlined in Section 2 associated with storing command continuations in [8]
(e.g., explicitly passing static data and use of a snapback operator) are
completely unnecessary.

^ A full description of newSegment is found in [8].

Figure 7 contains the specifications for < and while, which are very similar
to the specifications of addition and if-then, respectively, that we have seen
already.
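The staging of if-then can be approximated in Python as follows. This is a sketch under several assumptions rather than the specification itself: the dynamic state is a pair (code store, value store), so a value store is added to Dynam to make effects observable; the answer type of the CPS layer is simplified to a state-to-state function; code-store entries are dynamic computations, so the captured continuation is stored as `kappa(None)`; and `c_bool` and `c_assign` are hypothetical primitives.

```python
# Dynam: CPS over a state (code, sto). A computation takes a continuation
# k (value -> state -> state) and yields a state -> state function.
def unit_d(v):
    return lambda k: k(v)


def bind_d(x, f):
    return lambda k: x(lambda a: f(a)(k))


def callcc(f):
    return lambda k: f(lambda a: lambda _k: k(a))(k)


def update_code(label, segment):
    return lambda k: lambda st: k(None)(({**st[0], label: segment}, st[1]))


def jump(label):
    # Run the segment stored at `label` in the current code store.
    return lambda k: lambda st: st[0][label](k)(st)


def if_then(phi_b, phi_c, l_c, l_k):
    return bind_d(phi_b, lambda b:
           callcc(lambda kappa:
           bind_d(update_code(l_k, kappa(None)), lambda _:
           bind_d(update_code(l_c, bind_d(phi_c, lambda _: jump(l_k))), lambda _:
           b(jump(l_c), jump(l_k))))))


# Static: state monad over the label counter.
def unit_s(x):
    return lambda n: (x, n)


def bind_s(x, f):
    def run(n):
        value, n1 = x(n)
        return f(value)(n1)
    return run


newlabel = lambda n: (n, n + 1)


def c_if_then(cb, cc):
    return bind_s(cb, lambda phi_b:
           bind_s(cc, lambda phi_c:
           bind_s(newlabel, lambda l_c:
           bind_s(newlabel, lambda l_k:
           unit_s(if_then(phi_b, phi_c, l_c, l_k))))))


def c_bool(flag):  # hypothetical: Bool as a selector on a pair
    return unit_s(unit_d((lambda t, f: t) if flag else (lambda t, f: f)))


def c_assign(x, v):  # hypothetical: write v to variable x in the value store
    return unit_s(lambda k: lambda st: k(None)((st[0], {**st[1], x: v})))


halt = lambda _: lambda st: st


def run(prog):
    phi, next_label = prog(0)                # static phase: labels only
    code, sto = phi(halt)(({}, {}))          # dynamic phase
    return next_label, sorted(code), sto


nested = c_if_then(c_bool(True), c_if_then(c_bool(True), c_assign("x", 1)))
outcome = run(nested)
```

All four labels of the nested conditional are allocated before the dynamic phase starts, so the stored continuations never need the label counter.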



5.3 Block Structure Compiler Building Block 



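\[
\begin{array}{l}
\mathrm{Dynam} = \mathrm{Id} \\
\mathrm{Static} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Env}\ (\mathcal{T}_{\mathrm{Env}}\ \mathit{Addr}\ \mathrm{Id}) \\
\mathrm{set}\ a = \lambda v.\ \mathrm{updateSto}[a \mapsto v] \\
\mathrm{get}\ a = \mathrm{getSto} \star_D \lambda\sigma.\ \mathrm{unit}_D(\sigma\,a) \\
\mathcal{C}[\![\mathrm{new}\ x\ \mathrm{in}\ c]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{void}) = \\
\quad \mathrm{rdAddr} \star_S \lambda a. \\
\quad \mathrm{inAddr}\ (a + 1) \\
\quad\quad \mathrm{rdEnv} \star_S \lambda\rho. \\
\quad\quad \mathrm{inEnv}\ (\rho[x \mapsto \mathrm{unit}_S(\mathrm{set}\ a,\ \mathrm{get}\ a)])\ \mathcal{C}[\![c]\!]
\end{array}
\]

Fig. 8. Compiler Building Block for Block Structure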



The block structure language includes new x in c, which declares a new
program variable x in c. The compiler building block for this language appears
in Figure 8. The static part of this specification allocates a free stack
location a, and the program variable x is bound to an accepter-expresser pair
[21] in the current environment $\rho$. In an accepter-expresser pair
(acc, exp), the accepter acc accepts an integer value and sets the value of its
variable to that value, and the expresser exp simply returns the current value
of the variable. set a and get a set and return the contents of location a,
respectively. c is then compiled in the updated environment and larger stack
(a + 1).



5.4 Imperative Features Compiler Building Block 



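\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id}, \quad \mathrm{Static} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Env}\ \mathrm{Id} \\
\mathcal{C}[\![c_1; c_2]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{void}) = \\
\quad \mathcal{C}[\![c_1]\!] \star_S \lambda\phi_{c_1}. \\
\quad \mathcal{C}[\![c_2]\!] \star_S \lambda\phi_{c_2}. \\
\quad \mathrm{unit}_S\,(\phi_{c_1} \star_D \lambda\_.\ \phi_{c_2}) \\
\mathcal{C}[\![x := t]\!] : \mathrm{Static}(\mathrm{Dynam}\ \mathit{void}) = \\
\quad \mathrm{rdEnv} \star_S \lambda\rho. \\
\quad (\rho\,x) \star_S \lambda\langle\mathit{acc}, \_\rangle. \\
\quad \mathcal{C}[\![t]\!] \star_S \lambda\phi_t. \\
\quad \mathrm{unit}_S\,(\phi_t \star_D \lambda i : \mathit{int}.\ (\mathit{acc}\ i))
\end{array}
\]

Fig. 9. Compiler Building Block for Imperative Features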



The simple imperative language includes assignment (:=) and sequencing (;).
The compiler building block for this language appears in Figure 9. For
sequencing, the static part of the specification compiles $c_1$ and $c_2$ in
succession, while the dynamic (boxed) part runs them in succession. For
assignment, the static part of the specification retrieves the accepter [21]
acc for program variable x from the current environment $\rho$ and compiles t,
while the dynamic part calculates the value of t and passes it to acc.
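The block structure and imperative building blocks can be sketched together in Python. The encoding below is an assumption-laden illustration: a Static computation is a function of an environment and the next free address, a Dynam computation is a function from a store (a dict) to a (value, store) pair, and the clauses `c_const` and `c_var` for expressions are hypothetical additions needed to build a small program.

```python
# Dynam: state monad over a store (dict from addresses to ints).
def unit_d(v):
    return lambda sto: (v, sto)


def bind_d(x, f):
    def run(sto):
        value, sto1 = x(sto)
        return f(value)(sto1)
    return run


def set_loc(a):  # accepter for location a
    return lambda v: (lambda sto: (None, {**sto, a: v}))


def get_loc(a):  # expresser for location a
    return lambda sto: (sto[a], sto)


# Static: two stacked reader layers, environment rho and next free address a.
def unit_s(x):
    return lambda rho, a: x


def bind_s(x, f):
    return lambda rho, a: f(x(rho, a))(rho, a)


rd_addr = lambda rho, a: a
rd_env = lambda rho, a: rho


def in_addr(a1, x):
    return lambda rho, _a: x(rho, a1)


def in_env(rho1, x):
    return lambda _rho, a: x(rho1, a)


def c_new(x, c):
    return bind_s(rd_addr, lambda a:
           in_addr(a + 1,
               bind_s(rd_env, lambda rho:
               in_env({**rho, x: unit_s((set_loc(a), get_loc(a)))}, c))))


def c_seq(c1, c2):
    return bind_s(c1, lambda phi1:
           bind_s(c2, lambda phi2:
           unit_s(bind_d(phi1, lambda _: phi2))))


def c_assign(x, t):
    return bind_s(rd_env, lambda rho:
           bind_s(rho[x], lambda pair:
           bind_s(t, lambda phi_t:
           unit_s(bind_d(phi_t, lambda i: pair[0](i))))))


def c_const(n):  # assumed clause for constants
    return unit_s(unit_d(n))


def c_var(x):    # assumed clause for reading a variable via its expresser
    return bind_s(rd_env, lambda rho:
           bind_s(rho[x], lambda pair: unit_s(pair[1])))


# new x in new y in (x := 5; y := x)
prog = c_new("x", c_new("y",
       c_seq(c_assign("x", c_const(5)), c_assign("y", c_var("x")))))

phi = prog({}, 0)        # static phase: resolves x to address 0, y to address 1
value, store = phi({})   # dynamic phase
```

Variable names disappear during the static phase; the residual computation `phi` refers only to store addresses.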

6 Combining Compiler Building Blocks 



[Fig. 10. Diagram not recovered. Panel labels: Block Structure; Control-flow; Block Structure + Control-flow.]




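Figure 10 illustrates the process of combining the compiler building blocks
for the block structure and control-flow languages. It is important to emphasize
that this is much simpler than in [8], in that there is no explicit passing of
static data needed. The process is nothing more than applying the appropriate
monad transformers to create the Static and Dynam monads for the combined
language. Recall that for the block structure language:

\[ \mathrm{Static} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Env}\ (\mathcal{T}_{\mathrm{Env}}\ \mathit{Addr}\ \mathrm{Id}), \text{ and } \mathrm{Dynam} = \mathrm{Id} \]

For the control-flow language:

\[ \mathrm{Static} = \mathcal{T}_{\mathrm{St}}\ \mathit{Label}\ \mathrm{Id}, \text{ and } \mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Code}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id})) \]

To combine the compiler building blocks for these languages, one simply
combines the respective monad transformers:

\[
\begin{array}{l}
\mathrm{Static} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Env}\ (\mathcal{T}_{\mathrm{Env}}\ \mathit{Addr}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Label}\ \mathrm{Id})), \text{ and} \\
\mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Code}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id}))
\end{array}
\]

Now, the specifications for both of the smaller languages,
$\mathit{Eq}_{\mathrm{Block}}$ and $\mathit{Eq}_{\mathrm{CF}}$, apply for the
“larger” Static and Dynam monads, and so the compiler for the combined language
is specified by $\mathit{Eq}_{\mathrm{Block}} \cup \mathit{Eq}_{\mathrm{CF}}$.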



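Compiler:

\[
\begin{array}{l}
\mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Code}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id})), \quad \mathrm{Static} = \mathcal{T}_{\mathrm{Env}}\ \mathit{Env}\ (\mathcal{T}_{\mathrm{Env}}\ \mathit{Addr}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Label}\ \mathrm{Id})) \\
\mathit{Language} = \text{Expressions} \cup \text{Imperative} \cup \text{Control-flow} \cup \text{Block structure} \cup \text{Booleans} \\
\mathit{Equations} = \mathit{Eq}_{\text{Expressions}} \cup \mathit{Eq}_{\text{Imperative}} \cup \mathit{Eq}_{\text{Control-flow}} \cup \mathit{Eq}_{\text{Block structure}} \cup \mathit{Eq}_{\text{Booleans}}
\end{array}
\]

Source Code:

    new x in new y in
      x := 5; y := 1;
      while (1 < x) do
        y := y*x; x := x-1;

Target Code:

       0 := 5;
       1 := 1;
       jump 1;
    1: 2 := 1;
       3 := [0];
       BRLEQ [2] [3] 2 3;
    2: 2 := [1];
       3 := [0];
       1 := [2] * [3];
       2 := [0];
       3 := 1;
       0 := [2] - [3];
       jump 1;
    3: halt;

Fig. 11. Compiler for While language and example compilation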



Code is generated via type-directed partial evaluation [4] using the method
of Danvy and Vestergaard [5]. Figure 11 contains the compiler for the While
language, and an example program and its pretty-printed compiled version. All
that was necessary was to combine the compiler building blocks developed in
Section 5, as discussed in this section.
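To make the target code of Figure 11 concrete, here is a small Python interpreter for it. The instruction semantics are assumptions read off the listing, not definitions from the text: `a := v` writes to address `a`, `[a]` denotes the contents of `a`, and `BRLEQ x y L1 L2` is taken to jump to `L1` when `x <= y` and to `L2` otherwise. The tuple encoding and the names `lit`, `loc`, and `run` are hypothetical.

```python
def lit(n):
    return ("lit", n)   # a literal operand


def loc(a):
    return ("loc", a)   # the contents [a] of address a


# The target code of Figure 11, grouped by label.
program = {
    "entry": [("set", 0, lit(5)), ("set", 1, lit(1)), ("jump", 1)],
    1: [("set", 2, lit(1)), ("set", 3, loc(0)),
        ("brleq", loc(2), loc(3), 2, 3)],
    2: [("set", 2, loc(1)), ("set", 3, loc(0)),
        ("mul", 1, loc(2), loc(3)),
        ("set", 2, loc(0)), ("set", 3, lit(1)),
        ("sub", 0, loc(2), loc(3)),
        ("jump", 1)],
    3: [("halt",)],
}


def run(prog, start="entry"):
    store = {}

    def val(operand):
        kind, n = operand
        return n if kind == "lit" else store[n]

    block, pc = prog[start], 0
    while True:
        instr = block[pc]
        pc += 1
        tag = instr[0]
        if tag == "set":
            store[instr[1]] = val(instr[2])
        elif tag == "mul":
            store[instr[1]] = val(instr[2]) * val(instr[3])
        elif tag == "sub":
            store[instr[1]] = val(instr[2]) - val(instr[3])
        elif tag == "jump":
            block, pc = prog[instr[1]], 0
        elif tag == "brleq":
            # Assumed meaning: branch to the first label if x <= y.
            target = instr[3] if val(instr[1]) <= val(instr[2]) else instr[4]
            block, pc = prog[target], 0
        elif tag == "halt":
            return store


final_store = run(program)
```

Under these assumptions, address 0 holds x and address 1 holds y, so the program leaves the factorial of 5 in address 1.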

7 Correctness 

In this section, we outline an example correctness specification for a reusable
compiler building block written in metacomputation style. In particular, we
illustrate the advantages w.r.t. compiler correctness of metacomputation-based
compiler specifications over the monolithic-style specifications of [8], and
also the general usefulness of monads and monad transformers w.r.t. compiler
correctness. Although lack of space makes a full exposition of
metacomputation-based compiler correctness impossible here, we hope to convey
the basic issues^.

^ The interested reader may consult the first author’s forthcoming doctoral
dissertation.

The correctness of a reusable compiler building block for a source language
feature is specified by comparing the compilation semantics
$\mathcal{C}[\![-]\!]$ with the standard semantics $[\![-]\!]$ for that
feature. Let us take as an example the conditional if-then. Its standard and
compilation semantics are presented in Figure 6. A (slightly informal)
specification of if-then is: if $L_c \neq L_\kappa$ and $L_c$, $L_\kappa$ are
unbound in the code store, then

\[ \mathrm{IfThen}([\![b]\!], [\![c]\!], L_c, L_\kappa) \star_D \lambda\_.\ \mathrm{initCode} \;=\; [\![\mathrm{if}\ b\ \mathrm{then}\ c]\!] \star_D \lambda\_.\ \mathrm{initCode} \]

where $\mathrm{initCode} = \mathrm{updateCode}(\lambda\_.\ \Pi)$ for an
arbitrary constant $\Pi : \mathit{Code}$. Because
$\mathrm{IfThen}([\![b]\!], [\![c]\!], L_c, L_\kappa)$ will affect the code
store and $[\![\mathrm{if}\ b\ \mathrm{then}\ c]\!]$ will not,
$\mathrm{IfThen}([\![b]\!], [\![c]\!], L_c, L_\kappa) \neq [\![\mathrm{if}\ b\ \mathrm{then}\ c]\!]$.
But by “masking out” the code store state on both sides with initCode (which
sets the code store to the constant $\Pi$), we require that both sides of the
above equation have the same action on the value store Sto.

The above specification is easier to prove than the analogous one in
monolithic style because the metacomputation-based definition in Figure 6 just
stores the continuation $\kappa$, while the monolithic-style definition
manipulates $\kappa$ as was outlined in Sections 2 and 5.2. Furthermore, here is
an example of how monad transformers help with compiler correctness proofs.
Although the above equation holds in
$\mathrm{Dynam} = \mathcal{T}_{\mathrm{CPS}}\ \mathit{void}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Label}\ (\mathcal{T}_{\mathrm{St}}\ \mathit{Sto}\ \mathrm{Id}))$,
other monad transformers could be applied to Dynam for the purposes of adding
new source language features and the specification would still hold^. So, the
use of monad transformers in this work yields a kind of proof reuse for
metacomputation-based compiler correctness [14].



8 Conclusions and Future Work 



Metacomputations are a simple and elegant structure for representing staged 
computation within the semantics of a programming language. This paper presents 
a modular and extensible style of language specification based on metacompu- 
tation. This style uses two monads to factor the static and dynamic parts of the 
specification, thereby staging the specification and achieving strong binding-time 
separation. Because metacomputations are defined in terms of monads, they can 
be constructed modularly and extensibly using monad transformers. We exploit 
this fact to create modular compilers. 

Future work focuses on two areas: specifying other language constructs like 
objects, classes, and exceptions; and exploring the use of metacomputations in 
the semantics of two-level languages. 



^ Given certain fairly weak conditions on the order of monad transformer
application. See [12,13,14] for details.

Abstract. This paper describes work in progress on the design of an
ML-style metalanguage FreshML for programming with recursively defined
functions on user-defined, concrete data types whose constructors may involve
variable binding. Up to operational equivalence, values of such FreshML data
types can faithfully encode terms modulo α-conversion for a wide range of
object languages in a straightforward fashion. The design of FreshML is
‘semantically driven’, in that it arises from the model of variable binding in
set theory with atoms given by the authors in [7]. The language has a type
constructor for abstractions over names (= atoms) and facilities for declaring
locally fresh names. Moreover, recursive definitions can use a form of
pattern-matching on bound names in abstractions. The crucial point is that the
FreshML type system ensures that these features can only be used in well-typed
programs in ways that are insensitive to renaming of bound names.



1 Introduction 

This paper concerns the design of functional programming languages for meta- 
programming, by which we mean the activity of creating software systems — 
interpreters, compilers, proof checkers, proof assistants, and so on — that manip- 
ulate syntactical structures. An important part of such activity is the design 
of data structures to represent the terms of formal languages. The nature of 
such an object language will of course depend upon the particular application. 
It might be a language for programming, or one for reasoning, for example. But 
one thing is certain: in all but the most trivial cases, the object language will 
involve variable binding, with associated notions of free and bound variables, 
renaming of bound variables, substitution of terms for free variables, and so on. 
It is this aspect of representing object languages in metaprogramming languages 
upon which we focus here. 

Modern functional programming languages permit user-defined data types,
with pattern matching in definitions of functions on these data types (as far
as we know, this feature was introduced into functional programming by Rod
Burstall: see [2,1]). For object languages without variable binding, this
reduces the work involved in designing representations to a mere act of
declaration: a specification of the abstract syntax of the object language
gives rise more or less directly to the declaration of some algebraic data
types (mutually recursive ones in general). Consider the familiar example of
the language of terms of the untyped lambda calculus

    t ::= x | t t | λx.t                                    (1)

and a corresponding ML data type

    datatype Itree = Vr of string                           (2)
                   | Ap of Itree * Itree
                   | Lm of string * Itree

where x ranges over some fixed countably infinite set of variable symbols which
we have chosen to represent by values of the ML type string of character strings.
This gives a one-one representation of the abstract syntax trees of all (open or
closed) untyped lambda terms as closed ML values of type Itree. However, the
ML declaration takes no account of the fact that the term former λx.(−) involves
variable binding. Thus, if one wishes to identify terms of the object language up
to renaming of bound variables (as one often does), such representations are too
concrete. It is entirely up to programmers to ensure that their term-manipulating
programs respect the renaming discipline, an obligation which becomes irksome
and error-prone for complex object languages or large programs.

A common way round this problem is to introduce a new version of the
object language that eliminates variable binding constructs through the use of
de Bruijn indices [4]. For example, ‘nameless’ lambda terms are given by

    t' ::= n | t' t' | λt'                                  (3)

and a corresponding ML data type by

    datatype Itree' = Vr' of nat                            (4)
                    | Ap' of Itree' * Itree'
                    | Lm' of Itree'

where the indices n are natural numbers, represented by the values of a suitable
ML data type nat. Closed ML values of type Itree' correspond to nameless
terms t', which in turn correspond to α-equivalence classes of ordinary lambda
terms t (open or closed). Hence functions manipulating lambda terms modulo
α-conversion can be defined, and their properties proved, using structural
recursion and induction for the algebraic data type Itree'. This approach has
been adopted for a number of large systems written in ML involving syntax
manipulation (such as HOL [8] and Isabelle [14], for example). However, it does
have some drawbacks. Firstly, nameless terms are hard for humans to understand
and they need translation and display functions relating them to the usual
syntax with named bound variables. Secondly, some definitions (such as
substitution and weakening the context of free variables) are non-intuitive and
error-prone when cast in terms of de Bruijn indices. Lastly, and most
importantly, the ML language does not have any built-in support that might
alleviate these problems: one usually starts with a specification of an object
language in terms of context free grammars and some indication of the binding
constructs and has to craft its ‘name free’ representation by hand. Perhaps
more can be done automatically? In this paper we describe an ML-like language
with features that address these difficulties and provide improved automatic
support for metaprogramming with variable binding constructs. The key
innovation is to deduce at compile-time not only traditional type information,
but also information about the ‘freshness’ of object-level variables. This
information is used to guarantee that at run-time the observable behaviour of
well-typed meta-level expressions is insensitive to renaming bound object-level
variables. Thus, users are notified at compile-time if their
syntax-manipulating code descends below the level of abstraction which
identifies α-equivalent object-level expressions.
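The relationship between the named representation (2) and the nameless representation (4) can be sketched in Python. The sketch is only an illustration of the de Bruijn approach discussed above, not of FreshML: terms are plain tuples tagged with the constructor names from the two data types, and free variables are kept by name under a hypothetical `Free` tag, which is a simplification of how open terms are usually indexed.

```python
def to_de_bruijn(t, bound=()):
    """Translate a named term into a nameless one.

    Named terms:    ("Vr", x), ("Ap", t, u), ("Lm", x, t)
    Nameless terms: ("Vr'", n), ("Ap'", t, u), ("Lm'", t),
                    plus ("Free", x) for free variables (an assumption).
    `bound` lists the enclosing binders, innermost first.
    """
    tag = t[0]
    if tag == "Vr":
        x = t[1]
        if x in bound:
            # index = number of binders between the use and its binder
            return ("Vr'", bound.index(x))
        return ("Free", x)
    if tag == "Ap":
        return ("Ap'", to_de_bruijn(t[1], bound), to_de_bruijn(t[2], bound))
    # tag == "Lm": push the bound name and drop it from the result
    return ("Lm'", to_de_bruijn(t[2], (t[1],) + bound))


def alpha_equivalent(t, u):
    """Two named terms are alpha-equivalent iff their nameless forms coincide."""
    return to_de_bruijn(t) == to_de_bruijn(u)


k1 = ("Lm", "x", ("Lm", "y", ("Vr", "x")))   # \x.\y.x
k2 = ("Lm", "a", ("Lm", "b", ("Vr", "a")))   # \a.\b.a
k3 = ("Lm", "x", ("Lm", "y", ("Vr", "y")))   # \x.\y.y

same = alpha_equivalent(k1, k2)
different = alpha_equivalent(k1, k3)
```

The comparison is a plain structural equality on the nameless side, which is the convenience the text attributes to de Bruijn indices, while the translation function is the extra machinery it identifies as a drawback.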

Our language design is guided by the mathematical model of binding
operations introduced in [7] using a Fraenkel-Mostowski permutation model of
sets with atoms. A key feature of this model is that it provides a
syntax-independent notion of a name (i.e. an atom) being fresh for a given
object. For this reason the resulting programming language is called FreshML.
Figure 1 gives some sample FreshML declarations which continue the running
example of the untyped lambda calculus.^ They will be used in the rest of this
paper to illustrate the features of the new language. We attempt to explain
FreshML without assuming knowledge of the mathematics underlying our model of
binding; for the interested reader, the intended model of FreshML is sketched
in an Appendix to this paper. Sections 2-5 describe the novel features of
FreshML compared with ML, namely atoms, freshness, atom abstraction/concretion
and pattern matching with abstraction patterns. Sections 6-8 discuss the
interaction of these features with standard ones for equality, recursive
functions and types not involving atoms. It should be stressed that the design
of FreshML is still evolving: Section 9 discusses some of the possibilities and
reviews related work.



Note (Meta-level versus object-level binding). The metalanguage Fresh- 
ML provides a novel treatment of binding operations in object languages. How- 
ever, in describing FreshML syntax we treat its various binding constructs in a 
conventional way — by first giving their abstract syntax and then defining what 
the free, bound and binding identifiers are in such expressions. This in turn gives 
rise to a conventional definition of the capture-avoiding substitution [exp/x]exp' 
of a FreshML expression exp for all free occurrences of an identifier x in an 
expression exp' . We write fv(exp) for the finite set of free identifiers in exp. 



^ It will be seen from these declarations that the syntax of function declarations (and 
case expressions) in FreshML is more like that of CAML [3] than that of Standard 
ML [12]. 
(* Lambda terms, modulo alpha conversion. *) 
datatype lam = Var of atm 
             | App of lam * lam 
             | Lam of [atm] lam; 

(* Encoding of a couple of familiar combinators. *) 
val I = new a in Lam a.(Var a) end; 
val K = new a in new b in Lam a.(Lam b.(Var a)) end end; 

(* A function sub: lam * [atm] lam -> lam 
   implementing capture avoiding substitution. *) 
fun sub = 
  { (t, a.(Var b)) where b=a => t 
  | (t, a.(Var b)) where b#a => Var b 
  | (t, a.(App(u,v)))        => App(sub(t, a.u), sub(t, a.v)) 
  | (t, a.(Lam b.u))         => Lam b.(sub(t, a.u)) }; 

(* A function cbv: lam -> lam 
   implementing call-by-value evaluation. *) 
fun cbv = 
  { App(t,u) => case (cbv t) of {Lam e => cbv(sub(cbv u, e))} 
  | v        => v }; 

(* A function rem: [atm] (atm list) -> (atm list) 
   taking a list of atoms with one atom abstracted and removing it. *) 
fun rem = 
  { a.nil                 => nil 
  | a.(x::xs) where x=a   => rem a.xs 
  | a.(x::xs) where x#a   => x::(rem a.xs) }; 

(* A function fv: lam -> (atm list) 
   which lists the free variables of a lambda term, 
   possibly with repeats. *) 
fun fv = 
  { Var a    => a::nil 
  | App(t,u) => append(fv t) (fv u) 
  | Lam a.t  => rem a.(fv t) }; 

(* Unlike the previous function, the following function, which tries 
   to list the bound variables of a lambda term, does not type check: 
   good! *) 
fun bv = 
  { Var a    => nil 
  | App(t,u) => append(bv t) (bv u) 
  | Lam a.t  => a::(bv t) }; 



Fig. 1. Sample FreshML declarations 
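
For comparison with the declaration of sub in Fig. 1, here is a minimal Python sketch of capture-avoiding substitution written "by hand" in a name-carrying style, which is the bookkeeping that FreshML is designed to take over. It is not FreshML and not the authors' implementation: the tuple encoding of terms, the helper names fresh and rename, and the use of '#' inside generated names are all assumptions made for illustration.

```python
import itertools

_counter = itertools.count()


def fresh(prefix):
    """Hypothetical fresh-name supply; assumes user names never contain '#'."""
    return f"{prefix}#{next(_counter)}"


# Terms are tuples: ("Var", x) | ("App", t, u) | ("Lam", x, body)

def rename(term, old, new):
    """Replace free occurrences of `old` by `new`; `new` is assumed globally fresh."""
    tag = term[0]
    if tag == "Var":
        return ("Var", new) if term[1] == old else term
    if tag == "App":
        return ("App", rename(term[1], old, new), rename(term[2], old, new))
    _, x, body = term
    if x == old:  # `old` is rebound here, so nothing below is a free occurrence
        return term
    return ("Lam", x, rename(body, old, new))


def sub(t, a, term):
    """Substitute t for the free occurrences of variable a in term."""
    tag = term[0]
    if tag == "Var":
        return t if term[1] == a else term
    if tag == "App":
        return ("App", sub(t, a, term[1]), sub(t, a, term[2]))
    _, b, body = term
    b_new = fresh(b)  # explicit renaming step: always pick a new bound name
    return ("Lam", b_new, sub(t, a, rename(body, b, b_new)))


def fv(term):
    """Free variables of a term, possibly with repeats."""
    tag = term[0]
    if tag == "Var":
        return [term[1]]
    if tag == "App":
        return fv(term[1]) + fv(term[2])
    _, x, body = term
    return [y for y in fv(body) if y != x]


# Substituting y for x under a binder named y must not capture y.
result = sub(("Var", "y"), "x", ("Lam", "y", ("Var", "x")))
```

In the FreshML version the last clause of sub needs no call to a name supply, because matching against the pattern Lam b.u already delivers a b that is fresh for t.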
2 Freshness 

Variables of object languages are represented in FreshML by value identifiers 
of a special built-in type atm of atoms.^ Operationally speaking, atm behaves 
somewhat like the ML type unit ref, but the way in which dynamically created 
values of type atm (which are drawn from a fixed, countably infinite set A = 
{a, a', . . .} of semantic atoms) can be used is tightly constrained by the Fresh- 
ML type system, as described below. Just as addresses of references do not occur 
explicitly in ML programs, semantic atoms do not occur explicitly in the syntax 
of FreshML. Rather, they can be referred to via a local declaration of the form 

new a in exp end (5) 

where a is an identifier implicitly of type atm. This is a binding operation: 
free occurrences of a in the expression exp are bound in new a in exp end. 
Its behaviour is analogous to the Standard ML declaration 

let val a = ref () in exp end (6) 

in that the expression in (5) is evaluated by associating a with the first semantic 
atom unused by the current value environment and then evaluating exp in that 
augmented value environment. (We formulate this more precisely at the end of 
this section.) As in ML, evaluation in FreshML is done after type checking, and it 
is there that an important difference between the expressions in (5) and (6) shows 
up. Compared with ML’s type-checking of the expression in (6), the FreshML 
type system imposes a restriction on the expression in (5) which always seems to 
be present in uses of ‘fresh names’ in informal syntax-manipulating algorithms, 
namely that 

expressions in the scope of a fresh name a only use it in ways that are 
insensitive to renaming. 

For example, although let val a = ref () in a end has type unit ref in ML, the 
expression new a in a end is not typeable in FreshML — the meaning of a is clearly 
sensitive to renaming a. On the other hand, in the next section we introduce 
atom-abstraction expressions such as a . a, whose meaning (either operationally 
or denotationally) is insensitive to renaming a even though they contain free 
occurrences of a, giving rise to well typed expressions such as new a in a . a end. 
(Some other examples of well typed new-expressions are given in Fig. 1.) 

Type System. To achieve the restrictions mentioned above, the FreshML type 
system deduces for an expression exp judgments not only about its type, 
$\Gamma \vdash \mathit{exp} : \mathit{ty}$, but also about which atoms a are fresh with respect to it, 

$$\Gamma \vdash \mathit{exp} \mathrel{\#} \mathtt{a}. \qquad (7)$$

In the current experimental version of FreshML, there is only one such type. Future 
versions will allow the programmer to declare as many distinct copies of this type as 
needed, for example, for the distinct sorts of names there are in a particular object 
language. 
Here a is some value identifier assigned type atm by the typing context $\Gamma$. 
FreshML typing contexts may contain both typing assumptions about value identifiers, 
x : ty, and freshness assumptions about them, x # a (if $\Gamma$ contains such an 
assumption it must also contain a : atm and x : ty for some type ty). The 
intended meaning of statements such as ‘x # a’ is that, in the given value 
environment, the denotation of x (an element of an FM-set) does not contain 
the semantic atom associated to a in its support; the mathematical notions 
of ‘FM-set’ and ‘support’ are explained in the Appendix (see Definitions A.1 
and A.2). Here we just give rules for inductively generating typing and freshness 
judgements that are sound for this notion. In fact it is convenient to give an 
expression’s typing and freshness properties simultaneously, using assertions of 
the form $\mathit{exp} : \mathit{ty} \mathrel{\#} \{\mathtt{a}_1, \ldots, \mathtt{a}_n\}$ standing for the conjunction 

$$\mathit{exp} : \mathit{ty} \;\;\&\;\; \mathit{exp} \mathrel{\#} \mathtt{a}_1 \;\;\&\;\; \cdots \;\;\&\;\; \mathit{exp} \mathrel{\#} \mathtt{a}_n .$$

Thus the FreshML type system can be specified using judgments of the form 

$$\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}} \qquad (8)$$

where 

$$\Gamma = (\mathtt{x}_1 : \mathit{ty}_1 \mathrel{\#} \overline{\mathtt{a}}_1), \ldots, (\mathtt{x}_n : \mathit{ty}_n \mathrel{\#} \overline{\mathtt{a}}_n) \qquad (9)$$

and 

- $\mathtt{x}_1, \ldots, \mathtt{x}_n$ are distinct value identifiers which include all the free identifiers 
of the FreshML expression exp; 

- $\mathit{ty}_1, \ldots, \mathit{ty}_n$ and $\mathit{ty}$ are FreshML types; 

- $\overline{\mathtt{a}}_1, \ldots, \overline{\mathtt{a}}_n$ and $\overline{\mathtt{a}}$ are finite sets of value identifiers with the property that 
each $\mathtt{a} \in \overline{\mathtt{a}}_1 \cup \ldots \cup \overline{\mathtt{a}}_n \cup \overline{\mathtt{a}}$ is assigned type atm by the typing context, i.e. is 
equal to one of the $\mathtt{x}_i$ with $\mathit{ty}_i = \mathtt{atm}$. 

We write $\Gamma \vdash \mathit{exp} : \mathit{ty}$ as an abbreviation for $\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \emptyset$. When discussing 
just the freshness properties of an expression, we write $\Gamma \vdash \mathit{exp} \mathrel{\#} \mathtt{a}$ to mean 
that $\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \{\mathtt{a}\}$ holds for some type ty. 

The rule for generating type and freshness information for new-expressions 
is (11) in Fig. 2. The notation $\mathtt{a} : \mathtt{atm} \otimes \Gamma$ used there indicates the context 
obtained from $\Gamma$ by adding the assumptions a : atm and x # a for each value 
identifier x declared in $\Gamma$ (where we assume a does not occur in $\Gamma$). For example, 
if $\Gamma = (\mathtt{x} : \mathtt{atm}), (\mathtt{y} : \mathit{ty} \mathrel{\#} \{\mathtt{x}\})$, then 

$$\mathtt{a} : \mathtt{atm} \otimes \Gamma = (\mathtt{a} : \mathtt{atm}), (\mathtt{x} : \mathtt{atm} \mathrel{\#} \{\mathtt{a}\}), (\mathtt{y} : \mathit{ty} \mathrel{\#} \{\mathtt{a}, \mathtt{x}\}) .$$

The side condition $\mathtt{a} \notin \mathrm{dom}(\Gamma) \cup \overline{\mathtt{a}}$ in rule (11) is comparable to ones for more 
familiar binding constructs, such as function abstraction; given $\Gamma$ and $\overline{\mathtt{a}}$, within 
the α-equivalence class of the expression new a in exp end we have room to 
choose the bound identifier a so that the side condition is satisfied. 
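
The context operation used in rule (11) can be made concrete with a small Python sketch. The dictionary representation of typing contexts (identifier mapped to a pair of a type and the set of atom identifiers it is assumed fresh for) and the function name extend_with_fresh_atom are assumptions made for illustration; only the operation itself comes from the text.

```python
ATM = "atm"


def extend_with_fresh_atom(ctx, a):
    """Compute the context written `a : atm (x) ctx` in the text.

    ctx maps each value identifier to (type, frozenset of atom identifiers
    it is assumed to be fresh for). The new atom a gets no freshness
    assumptions of its own; every existing identifier becomes fresh for a.
    """
    if a in ctx:
        raise ValueError(f"{a} already occurs in the context; rename it first")
    extended = {a: (ATM, frozenset())}
    for x, (ty, fresh_for) in ctx.items():
        extended[x] = (ty, fresh_for | {a})
    return extended


# The example from the text: Gamma = (x : atm), (y : ty # {x}).
gamma = {"x": (ATM, frozenset()), "y": ("ty", frozenset({"x"}))}
extended = extend_with_fresh_atom(gamma, "a")
```

The check at the top of the function plays the role of the side condition on a: rather than renaming automatically, this sketch simply refuses a clashing identifier.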

In the current experimental version of FreshML, we just consider monomorphic types 
built up from basic ones like atm and string using products, functions, recursively 
defined data type constructions and the atom-abstraction type constructor described 
in the next section. Introducing type schemes and ML-style polymorphism seems 
unproblematic in principle, but remains a topic for future work. 
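$$\frac{(\mathtt{x} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}') \in \Gamma \qquad \overline{\mathtt{a}} \subseteq \overline{\mathtt{a}}'}{\Gamma \vdash \mathtt{x} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}} \qquad (10)$$

$$\frac{\mathtt{a} : \mathtt{atm} \otimes \Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} (\{\mathtt{a}\} \cup \overline{\mathtt{a}}) \qquad \mathtt{a} \notin \mathrm{dom}(\Gamma) \cup \overline{\mathtt{a}}}{\Gamma \vdash (\mathtt{new\ a\ in}\ \mathit{exp}\ \mathtt{end}) : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}} \qquad (11)$$

$$\frac{\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} (\overline{\mathtt{a}} \setminus \{\mathtt{a}\}) \qquad \Gamma(\mathtt{a}) = \mathtt{atm}}{\Gamma \vdash \mathtt{a}.\mathit{exp} : [\mathtt{atm}]\mathit{ty} \mathrel{\#} \overline{\mathtt{a}}} \qquad (12)$$

$$\frac{\Gamma \vdash \mathit{exp} : [\mathtt{atm}]\mathit{ty} \mathrel{\#} (\{\mathtt{a}\} \cup \overline{\mathtt{a}}) \qquad \Gamma \vdash \mathtt{a} : \mathtt{atm} \mathrel{\#} \overline{\mathtt{a}}}{\Gamma \vdash \mathit{exp} \mathbin{@} \mathtt{a} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}} \qquad (13)$$

$$\frac{\Gamma \vdash \mathit{exp} : [\mathtt{atm}]\mathit{ty} \mathrel{\#} \overline{\mathtt{a}} \qquad \mathtt{a}, \mathtt{x} \notin \mathrm{dom}(\Gamma) \cup \overline{\mathtt{a}}' \qquad (\mathtt{a} : \mathtt{atm} \otimes \Gamma), (\mathtt{x} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}) \vdash \mathit{exp}' : \mathit{ty}' \mathrel{\#} (\{\mathtt{a}\} \cup \overline{\mathtt{a}}')}{\Gamma \vdash \mathtt{case}\ \mathit{exp}\ \mathtt{of}\ \{\mathtt{a.x} \mathbin{\mathtt{=>}} \mathit{exp}'\} : \mathit{ty}' \mathrel{\#} \overline{\mathtt{a}}'} \qquad (14)$$

$$\frac{\Gamma(\mathtt{a}) = \Gamma(\mathtt{b}) = \mathtt{atm} \qquad \Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}} \qquad \Gamma[\mathtt{a} \mathrel{\#} \mathtt{b}] \vdash \mathit{exp}' : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}}{\Gamma \vdash \mathtt{ifeq(a,b)\ then}\ \mathit{exp}\ \mathtt{else}\ \mathit{exp}' : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}} \qquad (15)$$

$$\frac{\Gamma(\mathtt{a}) = \mathtt{atm} \text{ for all } \mathtt{a} \in \overline{\mathtt{a}}}{\Gamma \vdash \mathtt{()} : \mathtt{unit} \mathrel{\#} \overline{\mathtt{a}}} \qquad (16)$$

$$\frac{\Gamma \vdash \mathit{exp}_1 : \mathit{ty}_1 \mathrel{\#} \overline{\mathtt{a}} \qquad \Gamma \vdash \mathit{exp}_2 : \mathit{ty}_2 \mathrel{\#} \overline{\mathtt{a}}}{\Gamma \vdash (\mathit{exp}_1, \mathit{exp}_2) : \mathit{ty}_1 * \mathit{ty}_2 \mathrel{\#} \overline{\mathtt{a}}} \qquad (17)$$

$$\frac{\Gamma \vdash \mathit{exp} : \mathit{ty}_1 * \mathit{ty}_2 \mathrel{\#} \overline{\mathtt{a}} \qquad \mathtt{x}, \mathtt{y} \notin \mathrm{dom}(\Gamma) \qquad \Gamma, (\mathtt{x} : \mathit{ty}_1 \mathrel{\#} \overline{\mathtt{a}}), (\mathtt{y} : \mathit{ty}_2 \mathrel{\#} \overline{\mathtt{a}}) \vdash \mathit{exp}' : \mathit{ty}' \mathrel{\#} \overline{\mathtt{a}}'}{\Gamma \vdash \mathtt{case}\ \mathit{exp}\ \mathtt{of}\ \{\mathtt{(x,y)} \mathbin{\mathtt{=>}} \mathit{exp}'\} : \mathit{ty}' \mathrel{\#} \overline{\mathtt{a}}'} \qquad (18)$$

$$\frac{\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}} \qquad \Gamma, (\mathtt{x} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}) \vdash \mathit{exp}' : \mathit{ty}' \mathrel{\#} \overline{\mathtt{a}}' \qquad \mathtt{x} \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash \mathtt{let\ val\ x} = \mathit{exp}\ \mathtt{in}\ \mathit{exp}'\ \mathtt{end} : \mathit{ty}' \mathrel{\#} \overline{\mathtt{a}}'} \qquad (19)$$

$$\frac{\Gamma, (\mathtt{f} : \mathit{ty} \to \mathit{ty}'), (\mathtt{x} : \mathit{ty}) \vdash \mathit{exp} : \mathit{ty}' \qquad \mathtt{f}, \mathtt{x} \notin \mathrm{dom}(\Gamma) \qquad \Gamma(\mathtt{a}) = \mathtt{atm} \text{ for all } \mathtt{a} \in \overline{\mathtt{a}} \qquad \Gamma \vdash \mathtt{x}_i : \mathit{ty}_i \mathrel{\#} \overline{\mathtt{a}} \text{ for all } \mathtt{x}_i \in \mathrm{fv}(\mathit{exp}) \setminus \{\mathtt{f}, \mathtt{x}\}}{\Gamma \vdash \mathtt{fun\ f} = \{\mathtt{x} \mathbin{\mathtt{=>}} \mathit{exp}\} : (\mathit{ty} \to \mathit{ty}') \mathrel{\#} \overline{\mathtt{a}}} \qquad (20)$$

$$\frac{\Gamma \vdash \mathit{exp}_1 : (\mathit{ty} \to \mathit{ty}') \mathrel{\#} \overline{\mathtt{a}} \qquad \Gamma \vdash \mathit{exp}_2 : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}}{\Gamma \vdash \mathit{exp}_1\ \mathit{exp}_2 : \mathit{ty}' \mathrel{\#} \overline{\mathtt{a}}} \qquad (21)$$

$$\frac{\mathit{ty} \text{ pure} \qquad \Gamma \vdash \mathit{exp} : \mathit{ty} \qquad \Gamma(\mathtt{a}) = \mathtt{atm} \text{ for all } \mathtt{a} \in \overline{\mathtt{a}}}{\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}} \qquad (22)$$

Fig. 2. Excerpt from the type system of FreshML 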
To understand rule (11) better, consider the special case when $\Gamma$ and $\overline{\mathtt{a}}$ are 
empty. This tells us that a closed expression of the form new a in exp end has 
type ty if not only $\mathtt{a} : \mathtt{atm} \vdash \mathit{exp} : \mathit{ty}$, but also $\mathtt{a} : \mathtt{atm} \vdash \mathit{exp} \mathrel{\#} \mathtt{a}$. This second 
property guarantees that although exp may involve the atom a, its meaning is 
unchanged if we rename a, and hence it is meaningful to ‘anonymise’ a in exp 
by forming the expression new a in exp end. 

But how do we generate the freshness assertions needed in the hypothesis of 
rule (11) in the first place? One rather trivial source arises from the fact that if 
a is fresh for all the free identifiers in an expression exp, then it is fresh for exp 
itself (this is a derivable property of the FreshML type system); so in particular 
a is always fresh for closed expressions. However, it can indeed be the case that 
$\Gamma \vdash \mathit{exp} \mathrel{\#} \mathtt{a}$ holds with a occurring freely in exp. Section 3 introduces the 
principal source of such non-trivial instances of freshness. 

Operational Semantics. At the beginning of this section we said that the opera- 
tional behaviour of the FreshML expression new a in exp end is like that of the 
ML expression let val a = ref () in exp end. In order to be more precise, we 
need to describe the operational semantics of FreshML. The current experimen- 
tal version of FreshML is a pure functional programming language, in the sense 
that the only effects of expression evaluation are either to produce a value or to 
not terminate. We describe evaluation of expressions exp using a judgement of 
the form 

$$E \vdash \mathit{exp} \Rightarrow v$$

where $E$ is a value environment (whose domain contains the free value identifiers 
of exp) and $v$ is a semantic value. These are mutually recursively defined as 
follows: $E$ is a finite mapping from value identifiers to semantic values; and, 
for the fragment of FreshML typed in Fig. 2, we can take $v$ to be given by the 
grammar 

$$v ::= a \mid abs(a, v) \mid unit \mid pr(v, v) \mid fun(\mathtt{x}, \mathtt{x}, \mathit{exp}, E)$$

where $a$ ranges over semantic atoms, x over value identifiers, and exp over 
expressions. The rule for evaluating new-expressions is (24) in Fig. 3. The notation 
$fa(E)$ used in that rule stands for the finite set of ‘synthetically’ free semantic 
atoms of a value environment $E$; this is defined by 

$$fa(E) = \bigcup_{\mathtt{x} \in \mathrm{dom}(E)} fa(E(\mathtt{x}))$$

where $fa(a) = \{a\}$, $fa(abs(a, v)) = fa(v) \setminus \{a\}$, $fa(pr(v_1, v_2)) = fa(v_1) \cup fa(v_2)$, 
$fa(unit) = \emptyset$ and $fa(fun(\mathtt{f}, \mathtt{x}, \mathit{exp}, E)) = \bigcup_{\mathtt{y} \in \mathrm{fv}(\mathit{exp}) \setminus \{\mathtt{f}, \mathtt{x}\}} fa(E(\mathtt{y}))$. 
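
The definition of fa can be transcribed almost directly; the following Python sketch does so under some stated assumptions. Semantic values are encoded as tagged tuples, and because expressions are not modelled here, a closure carries a precomputed collection free_ids standing in for fv(exp). Both the encoding and the names fa and fa_env are illustrative choices, not part of FreshML.

```python
# Semantic values (assumed encoding):
#   ("atom", a) | ("abs", a, v) | ("unit",) | ("pr", v1, v2)
#   ("fun", f, x, free_ids, env)   where free_ids stands in for fv(exp)

def fa(v):
    """Free semantic atoms of a semantic value, following the clauses in the text."""
    tag = v[0]
    if tag == "atom":
        return {v[1]}
    if tag == "abs":
        return fa(v[2]) - {v[1]}
    if tag == "pr":
        return fa(v[1]) | fa(v[2])
    if tag == "unit":
        return set()
    if tag == "fun":
        _, f, x, free_ids, env = v
        # Only identifiers that are free in the body, other than f and x, count.
        return set().union(*(fa(env[y]) for y in free_ids if y not in (f, x)))
    raise ValueError(f"unknown semantic value tag: {tag}")


def fa_env(env):
    """Free semantic atoms of a value environment: the union over its domain."""
    return set().union(*(fa(v) for v in env.values()))


pair_under_abs = ("abs", "a", ("pr", ("atom", "a"), ("atom", "b")))
closure = ("fun", "f", "x", ["f", "x", "y"],
           {"y": ("atom", "c"), "z": ("atom", "d")})
```

Note that the closure clause ignores the binding of z in the captured environment, since z is not among the free identifiers of the body.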

It should be emphasised that the rules in Fig. 3 are only applied to well- 
typed expressions. Evaluation preserves typing and freshness information in the 
following sense. 

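Theorem 2.1 (Soundness of the type system with respect to evaluation). 
If $\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}}$, $E \vdash \mathit{exp} \Rightarrow v$ and $E : \Gamma$, then $v : \mathit{ty}$ and 
$fa(v) \cap \{ E(\mathtt{a}) \mid \mathtt{a} \in \overline{\mathtt{a}} \} = \emptyset$. (The definitions of ‘$E : \Gamma$’ and ‘$v : \mathit{ty}$’ for the 
fragment of FreshML typed in Fig. 2 are given in Fig. 4.) 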
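$$\frac{E(\mathtt{x}) = v}{E \vdash \mathtt{x} \Rightarrow v} \qquad (23)$$

$$\frac{E[\mathtt{a} \mapsto a] \vdash \mathit{exp} \Rightarrow v \qquad a \notin fa(E)}{E \vdash \mathtt{new\ a\ in}\ \mathit{exp}\ \mathtt{end} \Rightarrow v} \qquad (24)$$

$$\frac{E \vdash \mathit{exp} \Rightarrow v \qquad E(\mathtt{a}) = a}{E \vdash \mathtt{a}.\mathit{exp} \Rightarrow abs(a, v)} \qquad (25)$$

$$\frac{E \vdash \mathit{exp} \Rightarrow abs(a', v') \qquad E(\mathtt{a}) = a \qquad v = (a'\ a) \cdot v'}{E \vdash \mathit{exp} \mathbin{@} \mathtt{a} \Rightarrow v} \qquad (26)$$

$$\frac{E \vdash \mathit{exp} \Rightarrow abs(a, v) \qquad a' \notin fa(E) \cup fa(abs(a, v)) \qquad v'' = (a'\ a) \cdot v \qquad E[\mathtt{a} \mapsto a', \mathtt{x} \mapsto v''] \vdash \mathit{exp}' \Rightarrow v'}{E \vdash \mathtt{case}\ \mathit{exp}\ \mathtt{of}\ \{\mathtt{a.x} \mathbin{\mathtt{=>}} \mathit{exp}'\} \Rightarrow v'} \qquad (27)$$

$$\frac{E(\mathtt{a}) = E(\mathtt{b}) \qquad E \vdash \mathit{exp} \Rightarrow v}{E \vdash \mathtt{ifeq(a,b)\ then}\ \mathit{exp}\ \mathtt{else}\ \mathit{exp}' \Rightarrow v} \qquad (28)$$

$$\frac{E(\mathtt{a}) \neq E(\mathtt{b}) \qquad E \vdash \mathit{exp}' \Rightarrow v'}{E \vdash \mathtt{ifeq(a,b)\ then}\ \mathit{exp}\ \mathtt{else}\ \mathit{exp}' \Rightarrow v'} \qquad (29)$$

$$\frac{}{E \vdash \mathtt{()} \Rightarrow unit} \qquad (30)$$

$$\frac{E \vdash \mathit{exp}_1 \Rightarrow v_1 \qquad E \vdash \mathit{exp}_2 \Rightarrow v_2}{E \vdash (\mathit{exp}_1, \mathit{exp}_2) \Rightarrow pr(v_1, v_2)} \qquad (31)$$

$$\frac{E \vdash \mathit{exp} \Rightarrow pr(v_1, v_2) \qquad E[\mathtt{x} \mapsto v_1, \mathtt{y} \mapsto v_2] \vdash \mathit{exp}' \Rightarrow v'}{E \vdash \mathtt{case}\ \mathit{exp}\ \mathtt{of}\ \{\mathtt{(x,y)} \mathbin{\mathtt{=>}} \mathit{exp}'\} \Rightarrow v'} \qquad (32)$$

$$\frac{E \vdash \mathit{exp} \Rightarrow v \qquad E[\mathtt{x} \mapsto v] \vdash \mathit{exp}' \Rightarrow v'}{E \vdash \mathtt{let\ val\ x} = \mathit{exp}\ \mathtt{in}\ \mathit{exp}'\ \mathtt{end} \Rightarrow v'} \qquad (33)$$

$$\frac{}{E \vdash \mathtt{fun\ f} = \{\mathtt{x} \mathbin{\mathtt{=>}} \mathit{exp}\} \Rightarrow fun(\mathtt{f}, \mathtt{x}, \mathit{exp}, E)} \qquad (34)$$

$$\frac{E \vdash \mathit{exp}_1 \Rightarrow v_1 \qquad E \vdash \mathit{exp}_2 \Rightarrow v_2 \qquad v_1 = fun(\mathtt{f}, \mathtt{x}, \mathit{exp}, E_1) \qquad E_1[\mathtt{f} \mapsto v_1, \mathtt{x} \mapsto v_2] \vdash \mathit{exp} \Rightarrow v}{E \vdash \mathit{exp}_1\ \mathit{exp}_2 \Rightarrow v} \qquad (35)$$

Fig. 3. Excerpt from the operational semantics of FreshML 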
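$$\frac{a \in \mathbb{A}}{a : \mathtt{atm}}$$

$$\frac{a \in \mathbb{A} \qquad v : \mathit{ty}}{abs(a, v) : [\mathtt{atm}]\mathit{ty}}$$

$$\frac{}{unit : \mathtt{unit}}$$

$$\frac{v_1 : \mathit{ty}_1 \qquad v_2 : \mathit{ty}_2}{pr(v_1, v_2) : \mathit{ty}_1 * \mathit{ty}_2}$$

$$\frac{\Gamma, (\mathtt{f} : \mathit{ty} \to \mathit{ty}'), (\mathtt{x} : \mathit{ty}) \vdash \mathit{exp} : \mathit{ty}' \qquad \mathtt{f}, \mathtt{x} \notin \mathrm{dom}(\Gamma) \qquad E : \Gamma}{fun(\mathtt{f}, \mathtt{x}, \mathit{exp}, E) : \mathit{ty} \to \mathit{ty}'}$$

$$\frac{\mathrm{dom}(E) = \mathrm{dom}(\Gamma) \qquad E(\mathtt{x}_i) : \mathit{ty}_i \text{ and } fa(E(\mathtt{x}_i)) \cap \{ E(\mathtt{a}) \mid \mathtt{a} \in \overline{\mathtt{a}}_i \} = \emptyset, \text{ for all } (\mathtt{x}_i : \mathit{ty}_i \mathrel{\#} \overline{\mathtt{a}}_i) \in \Gamma}{E : \Gamma}$$

Fig. 4. Typing semantic values and value environments 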



3 Atom Abstraction 

FreshML user-declared data types can have value constructors involving binding, 
via a type constructor [atm](-) for atom abstractions. The data type 
lam declared in Fig. 1 provides an example of this, with its constructor Lam : 
[atm] lam -> lam. Expressions of an atom abstraction type [atm] ty are intro- 
duced with a syntactic form which is written a. exp, where a is a value identifier 
of type atm and exp an expression of type ty. Such atom abstraction expressions 
behave like pairs in which the first component is hidden, in a way comparable 
to hiding in abstract data types [13]. The operations for accessing the second 
component are discussed in Sects 4 and 5. We claim that two such expressions, 
a. exp and a' .exp', are contextually equivalent (i.e. are interchangeable in any 
complete FreshML program without affecting the observable results of evaluating 
it) if and only if 

for some (any) fresh a", (a" a) • exp and (a" a') • exp' are contextually 

equivalent expressions of type ty 

where (a" a) • exp indicates the expression obtained by interchanging all occurrences 
of a" and a in exp. It is for this reason that values of type lam correspond 
to α-equivalence classes of lambda terms: see [7, Theorem 2.1]. 

Atom abstraction expressions a.exp are evaluated using rule (25) in Fig. 3; 
and their typing and freshness properties are given by rule (12) in Fig. 2. In that 
rule, the notation $\overline{\mathtt{a}} \setminus \{\mathtt{a}\}$ means the finite set $\{ \mathtt{a}' \in \overline{\mathtt{a}} \mid \mathtt{a}' \neq \mathtt{a} \}$; and the 
side-condition $\Gamma(\mathtt{a}) = \mathtt{atm}$ means that, with $\Gamma$ as in equation (9), $\mathtt{a} = \mathtt{x}_i$ for some $i$ 
with $\mathit{ty}_i = \mathtt{atm}$. To understand rule (12) better, consider the special case when 
$\overline{\mathtt{a}} = \{\mathtt{a}\}$: then the rule tells us that provided $\Gamma(\mathtt{a}) = \mathtt{atm}$ and exp is typeable in 
context $\Gamma$, then a is always fresh for a.exp, i.e. $\Gamma \vdash (\mathtt{a}.\mathit{exp}) \mathrel{\#} \mathtt{a}$. This is the 
principal source of freshness assertions in FreshML. For example: 

Example 3.1. Given the declarations in Fig. 1 and some straightforward rules 
for typing data type constructors (which we omit, but which are analogous to 
the rules (16) and (17) for unit and pairs in Fig. 2), from rule (12) we have 

$$\mathtt{a} : \mathtt{atm} \vdash (\mathtt{a.(Var\ a)}) : [\mathtt{atm}]\mathtt{lam} \mathrel{\#} \{\mathtt{a}\}$$

and then 

$$\mathtt{a} : \mathtt{atm} \vdash (\mathtt{Lam\ a.(Var\ a)}) : \mathtt{lam} \mathrel{\#} \{\mathtt{a}\}.$$

Applying rule (11) to this yields 

$$\vdash (\mathtt{new\ a\ in\ Lam\ a.(Var\ a)\ end}) : \mathtt{lam}.$$

This closed expression of type lam is a FreshML representation of the lambda 
term λa.a. 
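
The derivation in Example 3.1 can be replayed mechanically. The Python sketch below implements rules (10) to (13) of Fig. 2 as a checker that, given a context, an expression and a freshness set, either returns a type or raises. Several things here are assumptions made for illustration: the tuple encoding of expressions, the exception name IllTyped, the CONSTRUCTORS table, and the constructor rule itself (the text omits it and only says it is analogous to rules (16) and (17)).

```python
ATM = "atm"

# Assumed constructor signatures for the lam datatype of Fig. 1:
# constructor name -> (argument type, result type); ("abs", ty) encodes [atm]ty.
CONSTRUCTORS = {"Var": (ATM, "lam"), "Lam": (("abs", "lam"), "lam")}


class IllTyped(Exception):
    """Raised when no rule of the fragment applies."""


def check(ctx, exp, fresh):
    """Return ty such that  ctx |- exp : ty # fresh  is derivable, else raise.

    ctx maps identifiers to (type, frozenset of atom identifiers it is fresh for).
    Expressions: ("id", x) | ("new", a, e) | ("abs", a, e) | ("conc", e, a)
                 | ("con", C, e)
    """
    fresh = frozenset(fresh)
    tag = exp[0]
    if tag == "id":                                   # rule (10)
        if exp[1] not in ctx:
            raise IllTyped(f"unbound identifier {exp[1]}")
        ty, known_fresh = ctx[exp[1]]
        if not fresh <= known_fresh:
            raise IllTyped(f"{exp[1]} is not known to be fresh for {set(fresh - known_fresh)}")
        return ty
    if tag == "new":                                  # rule (11)
        _, a, body = exp
        if a in ctx or a in fresh:
            raise IllTyped(f"side condition fails: rename bound atom {a}")
        extended = {x: (ty, fr | {a}) for x, (ty, fr) in ctx.items()}
        extended[a] = (ATM, frozenset())
        return check(extended, body, fresh | {a})
    if tag == "abs":                                  # rule (12)
        _, a, body = exp
        if a not in ctx or ctx[a][0] != ATM:
            raise IllTyped(f"{a} is not an atom")
        return ("abs", check(ctx, body, fresh - {a}))
    if tag == "conc":                                 # rule (13)
        _, body, a = exp
        ty = check(ctx, body, fresh | {a})
        if not (isinstance(ty, tuple) and ty[0] == "abs"):
            raise IllTyped("concretion of a non-abstraction")
        if check(ctx, ("id", a), fresh) != ATM:
            raise IllTyped(f"{a} is not an atom")
        return ty[1]
    if tag == "con":                                  # assumed constructor rule
        _, name, arg = exp
        arg_ty, result_ty = CONSTRUCTORS[name]
        if check(ctx, arg, fresh) != arg_ty:
            raise IllTyped(f"bad argument for constructor {name}")
        return result_ty
    raise IllTyped(f"unsupported expression form {tag}")


# new a in Lam a.(Var a) end
IDENTITY = ("new", "a", ("con", "Lam", ("abs", "a", ("con", "Var", ("id", "a")))))
identity_type = check({}, IDENTITY, set())
```

With this sketch, new a in a end is rejected at the rule (10) step, because the context gives a no freshness assumption for itself, whereas new a in a.a end is accepted because rule (12) removes a from the freshness obligation before a is looked up.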

Note that atom abstraction is not a binder in FreshML: the free identifiers of 
a.exp are a and all those of exp. The syntactic restriction that the expression to 
the left of the ‘abstraction dot’ be an identifier is needed because we only consider 
freshness assertions ‘exp # a’ with a an identifier rather than a compound 
expression (in order to keep the type system as simple as possible). This does 
not really restrict the expressiveness of FreshML, since a more general form of 
atom abstraction ‘atexp.exp’ (with atexp of type atm) can be simulated with 
the let-expression let val a = atexp in a.exp end. (The typing rule for 
let-expressions is (19) in Fig. 2.) 

Remark 3.2 (binding = renameability + name hiding). Example 3.1 
illustrates the fact that, unlike metalanguages that represent object-level binding 
via lambda abstraction, FreshML separates the renaming and the hiding aspects 
of variable binding. On the one hand a is still a free identifier in a. exp, but on the 
other hand the fact that new a in — end is a statically scoped binder can be used 
to hide the name of an atom (subject to the freshness conditions discussed in 
Sect. 2). We illustrate why this separation of the renaming and the hiding aspects 
of variable binding can be convenient in Example 4.1 below. To give the example 
we first have to discuss mechanisms for computing with expressions of atom 
abstraction type [atm]ty. FreshML offers two related alternatives: concretion 
expressions, exp @ a, and case-expressions using abstraction patterns, such as 
case exp of {a.x => exp'}. We discuss each in turn. 

4 Concretion 

Values of atom abstraction type have a double nature. So far we have seen their 
pair-like aspect; but as noted in [7, Lemma 4.1], they also have a function-like 
aspect: we can choose the name a of the first component in an atom abstraction 
exp : [atm]ty as we like, subject to a certain freshness restriction, and then 
the second component turns out to be a function of that choice, which we write 
as a ↦ exp @ a. We call exp @ a the concretion^ of the atom abstraction exp 
at the atom a. The typing and freshness properties of concretions are given by 
rule (13) in Fig. 2. Note in particular (taking $\overline{\mathtt{a}} = \emptyset$ in the rule) that given 
$\Gamma \vdash \mathit{exp} : [\mathtt{atm}]\mathit{ty}$, in order to deduce $\Gamma \vdash \mathit{exp} \mathbin{@} \mathtt{a} : \mathit{ty}$ we need to know not only 
that $\Gamma(\mathtt{a}) = \mathtt{atm}$, but also that $\Gamma \vdash \mathit{exp} \mathrel{\#} \mathtt{a}$. The denotational justification for 
this is given by Proposition A.2 of the Appendix. Operationally, the behaviour 
of concretion is given by rule (26) in Fig. 3. Thus evaluation of exp @ a proceeds 
by first evaluating exp; if a semantic value of the form $abs(a', v')$ is returned 
and the semantic atom associated with a in the current value environment is $a$, 
then the result of evaluating exp @ a is the semantic value $(a'\ a) \cdot v'$ obtained 
from $v'$ by interchanging all occurrences of $a'$ and $a$ in $v'$. By analogy with 
β-conversion for λ-abstraction and application, it is tempting to replace the use of 
transposition $(a'\ a) \cdot (-)$ by substitution $[a/a'](-)$, but this would not be correct. 
The reason for this has to do with the fact that while a.(-) is used to represent 
binding in object languages, it is not itself a binding operation in FreshML; so 
the substitution $[a/a'](-)$ can give rise to capture at the object-level in a way 
which $(a'\ a) \cdot (-)$ cannot. Here is an example to illustrate this: the result of 
evaluating 

(a'.(a.(Var a'))) @ a 

in the value environment $E = \{\mathtt{a} \mapsto a, \mathtt{a'} \mapsto a'\}$ is $abs(a', \mathrm{Var}\ a)$. Using $[a/a'](-)$ 
instead of $(a'\ a) \cdot (-)$ one would obtain the wrong value, namely $abs(a, \mathrm{Var}\ a)$, 
which is semantically distinct from $abs(a', \mathrm{Var}\ a)$ (in as much as the two semantic 
values have different denotations in the FM-sets model; see Sect. A.4). 

^ The terminology is adopted from [11, Sect. 12.1]. 
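
The difference between transposition and substitution in this example can be checked with a short Python sketch. The tagged-tuple encoding of semantic values, the treatment of Var as a generic constructor tag "con", and the helper names (swap, naive_subst, concretion, alpha_eq) are assumptions made for illustration; alpha_eq is the usual permutation-based equality on abstractions rather than anything defined in the text.

```python
def swap_atom(a, b, x):
    """Image of atom x under the transposition (a b)."""
    if x == a:
        return b
    if x == b:
        return a
    return x


def swap(a, b, v):
    """Apply (a b) to every atom occurring in v, abstracted positions included."""
    tag = v[0]
    if tag == "atom":
        return ("atom", swap_atom(a, b, v[1]))
    if tag == "abs":
        return ("abs", swap_atom(a, b, v[1]), swap(a, b, v[2]))
    if tag == "con":
        return ("con", v[1], swap(a, b, v[2]))
    raise ValueError(tag)


def naive_subst(new, old, v):
    """The tempting but incorrect [new/old](-): one-directional replacement."""
    rep = lambda x: new if x == old else x
    tag = v[0]
    if tag == "atom":
        return ("atom", rep(v[1]))
    if tag == "abs":
        return ("abs", rep(v[1]), naive_subst(new, old, v[2]))
    if tag == "con":
        return ("con", v[1], naive_subst(new, old, v[2]))
    raise ValueError(tag)


def fa(v):
    """Free atoms of a value in this encoding."""
    if v[0] == "atom":
        return {v[1]}
    if v[0] == "abs":
        return fa(v[2]) - {v[1]}
    return fa(v[2])


def alpha_eq(v, w):
    """Equality of values up to renaming of abstracted atoms."""
    if v[0] != w[0]:
        return False
    if v[0] == "atom":
        return v[1] == w[1]
    if v[0] == "con":
        return v[1] == w[1] and alpha_eq(v[2], w[2])
    (_, a, v1), (_, b, w1) = v, w
    if a == b:
        return alpha_eq(v1, w1)
    return a not in fa(w1) and alpha_eq(v1, swap(a, b, w1))


def concretion(value, a):
    """Rule (26): concrete abs(a', v') at a by computing (a' a) . v'."""
    _, bound, body = value
    return swap(bound, a, body)


def var(x):
    return ("con", "Var", ("atom", x))


# Value of a'.(a.(Var a')) in the environment {a -> "a", a' -> "a'"}.
abstraction = ("abs", "a'", ("abs", "a", var("a'")))
good = concretion(abstraction, "a")            # abs(a', Var a)
bad = naive_subst("a", "a'", abstraction[2])   # abs(a, Var a): a has been captured
```

Here good represents the object-level term λa'.a while bad represents the identity λa.a, and alpha_eq confirms that they are different values.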

Here is an example combining atom abstraction, concretion and local fresh- 
ness expressions (together with standard case-expressions and function decla- 
ration) . 

Example 4.1. One of the semantic properties of the atom abstraction set- 
former in the model in [7] (and the related models in [5]) which distinguish it 
from function abstraction is that it commutes with disjoint unions up to natural 
bijection. We can easily code this bijection in FreshML as follows. 

(* A type constructor for disjoint unions. *) 
datatype ('a,'b)sum = Inl of 'a | Inr of 'b; 

(* A bijection i : [atm](('a,'b)sum) -> ([atm]'a, [atm]'b)sum. *) 
fun i = { e => new a in 
                 case e@a of 
                   { Inl x => Inl(a.x) 
                   | Inr y => Inr(a.y) } 
               end } 

This illustrates the use of the fact mentioned in Remark 3.2 that name abstrac- 
tion and name hiding are separated in FreshML: note that in the definition of 
i, the locally fresh atom a is not used in an abstraction immediately, but rather 
at two places nested within its scope (a.x and a.y). 

The bijection in this example can be coded even more perspicuously using 
pattern matching: 

fun i' = { a.(Inl x) => Inl(a.x) 
         | a.(Inr y) => Inr(a.y) } 

Expressions like ‘a.(Inl x)’ are abstraction patterns. The fact that there is a 
useful matching mechanism for them is one of the major innovations of FreshML 
and we discuss it next. 
5 Matching with Atom Abstraction Patterns 

It is not possible to split semantic values of type [atm]ty into (atom, value)-pairs 
uniquely, because given $abs(a, v)$ then for any $a' \notin fa(v)$, $abs(a, v)$ has the same 
denotation as $abs(a', (a'\ a) \cdot v)$. However, if we only use the second component 
$(a'\ a) \cdot v$ in a way that is insensitive to which particular fresh $a'$ is chosen, we get 
a well-defined means of specifying a function on atom abstractions via matching 
against an abstraction pattern. The simplest example of such a pattern takes the 
form a.x, where a and x are distinct identifiers. Rule (14) in Fig. 2 gives the 
typing and freshness properties and rule (27) in Fig. 3 the evaluation properties 
for a case-expression with a single match using such a pattern. (In the expression 
case exp of {a.x => exp'}, the distinct identifiers a and x are binders with exp' 
as their scope.) 
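
The evaluation step in rule (27) amounts to "opening" an abstraction at a freshly chosen atom. The Python sketch below shows one way to do this; the tagged-tuple encoding of semantic values, the deterministic naming scheme a0, a1, ... used by fresh_atom, and the function name open_abs are all assumptions made for illustration.

```python
import itertools


def swap_atom(a, b, x):
    """Image of atom x under the transposition (a b)."""
    if x == a:
        return b
    if x == b:
        return a
    return x


def swap(a, b, v):
    """Apply (a b) throughout a value: ("atom", a) | ("abs", a, v) | ("pr", v, w) | ("unit",)."""
    tag = v[0]
    if tag == "atom":
        return ("atom", swap_atom(a, b, v[1]))
    if tag == "abs":
        return ("abs", swap_atom(a, b, v[1]), swap(a, b, v[2]))
    if tag == "pr":
        return ("pr", swap(a, b, v[1]), swap(a, b, v[2]))
    return v  # unit


def fa(v):
    """Free atoms of a value."""
    tag = v[0]
    if tag == "atom":
        return {v[1]}
    if tag == "abs":
        return fa(v[2]) - {v[1]}
    if tag == "pr":
        return fa(v[1]) | fa(v[2])
    return set()


def fresh_atom(avoid):
    """Pick the first atom a0, a1, ... not in avoid (hypothetical naming scheme)."""
    for i in itertools.count():
        candidate = f"a{i}"
        if candidate not in avoid:
            return candidate


def open_abs(value, env_atoms=()):
    """Premises of rule (27): choose a' outside fa(E) and fa(abs(a, v)),
    and return a' together with (a' a) . v."""
    _, a, v = value
    a_new = fresh_atom(set(env_atoms) | fa(value))
    return a_new, swap(a_new, a, v)


opened = open_abs(("abs", "b", ("pr", ("atom", "b"), ("atom", "a0"))), env_atoms={"a1"})
```

A match body that only asks questions such as "is the second component equal to the abstracted atom?" gives the same answer whichever fresh atom is chosen, which is the insensitivity that rule (14) enforces statically.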

Figure 1 gives some examples of declarations involving more complicated, 
nested abstraction patterns. We omit the formal definition of matching against 
such patterns, but the general idea is that atom identifiers to the left of an 
‘abstraction dot’ in a pattern represent semantic atoms that are fresh in the 
appropriate sense; and by checking freshness assertions, the type system ensures 
that the expression to the right of ‘=>’ in a match uses such identifiers in a way 
that is insensitive to renaming. For example, this implicit freshness in matching 
is what ensures that sub in Fig. 1 implements capture-avoiding substitution: in 
the last match clause, b is automatically fresh for t and so it makes sense to 
apply the substitution function sub(t, a.—) under Lam b.— . Another example 
is the declaration bv in Fig. 1, which does not type check because in the last 
match clause a is not fresh for a: : (bv t) . 

In the current experimental version of FreshML, all uses of abstraction pat- 
terns are eliminated by macro-expanding them using concretion and local fresh- 
ness. For example, as rules (14) and (27) may suggest, case exp of {a.x => exp'} 
can be regarded as an abbreviation for 

new a' in case exp @ a' of {x => [a'/a]exp'} end 

(where $\mathtt{a}' \notin \mathrm{fv}(\mathit{exp})$). However, to accommodate the more general notions of 
abstraction mentioned at the end of Sect. 9, we expect that matching with 
abstraction patterns will have to be a language primitive. 

Remark 5.1 (Comparison with Standard ML). According to its Definition [12], 
in Standard ML during type checking a pattern pat elaborates in the 
presence of a typing context $\Gamma$ to a piece of typing context $\Gamma'$ (giving the types 
of the identifiers in the pattern) and a type ty (the overall type of the pattern); 
then a match pat => exp' elaborates in the context $\Gamma$ to a function type ty -> ty' 
if exp' has type ty' in the augmented context $\Gamma \oplus \Gamma'$. Pattern elaboration is a 
little more complicated in FreshML. For abstraction patterns generate not only a 
piece of typing context $\Gamma'$, but also two kinds of freshness assumptions: ones that 
modify $\Gamma$ (cf. the use of $\mathtt{a} : \mathtt{atm} \otimes \Gamma$ in rule (14)); and ones that impose freshness 
restrictions on exp' in a match pat => exp' (cf. the use of $\mathit{exp}' : \mathit{ty}' \mathrel{\#} (\{\mathtt{a}\} \cup \overline{\mathtt{a}}')$ 
in rule (14)). 
Example 5.2. In Example 4.1 we gave an example where the use of abstraction 
patterns allows a simplification compared with code written just using the com- 
bination of new-expressions and concretion. Sometimes the reverse is the case. 
For example, in informal practice when specifying a function of finitely many 
abstractions, it is convenient to use the same name for the abstracted variable 
(and there is no loss of generality in doing this, up to α-conversion). This is 
not possible using FreshML patterns because, as in ML, we insist that they be 
linear: an identifier must occur at most once in a pattern. However, it is possible 
through explicit use of a locally fresh atom. Here is a specific example. 

In the FM-sets model (see the Appendix), the atom abstraction set-former 
commutes with cartesian products up to natural bijection. We can code this 
bijection in FreshML using pattern-matching as follows. 

(* A bijection ([atm]'a) * ([atm]'b) -> [atm]('a * 'b) *) 

fun p1 = { (a.x, b.y) => b.((a.x)@b, y) } 

Better would be 

fun p2 = { (e, b.y) => b.(e@b, y) } 

but an arguably clearer declaration (certainly a more symmetric one) uses local 
freshness explicitly: 

fun p3 = { (e, f) => new a in a. (e@a, f@a) end } 

Simplest of all would be the declaration 
fun p4 = { (a.x, a.y) => a. (x, y) } 

but this is not legal, because (a.x, a.y) is not a linear pattern. As we dis- 
cuss in the next section, atm is an equality type; so matching patterns with 
repeated occurrences of identifiers of that type is meaningful (although patterns 
like (a.x, a.y) involve a further complication, in that the repeated identifier is 
in a ‘negative position’, i.e. to the left of the ‘abstraction dot’). We have insisted 
on linear patterns in the current version of FreshML in order not to further com- 
plicate a notion of matching which, as we have seen, is already more complicated 
than in ML. 



6 Atom Equality 

Algorithms for manipulating syntax frequently make use of the decidability of 
equality of names (of object level variables, for example). Accordingly, the type 
atm of atoms in FreshML admits equality. In particular if atexp and atexp' are 
two expressions of type atm, then eq(atexp, atexp') is a boolean expression which 
evaluates to true if atexp and atexp' evaluate to the same semantic atom and 
evaluates to false if they evaluate to different ones. What more need one say? In 
fact, when it comes to type checking there is more to say. To see why, consider 
the following declaration of a function taking a list of atoms with one atom 
abstracted and removing it. 
fun rem2 = { e => new a in 
                    case e@a of 
                      { nil   => nil 
                      | b::bs => ifeq(b,a) then (rem2 a.bs) 
                                 else b::(rem2 a.bs) } 
                  end } 

This makes use of a form of conditional 

ifeq(a,b) then exp else exp' 

which branches on the equality of a and b — see rules (28) and (29) in Fig. 3. For 
the above declaration of rem2 to type-check as a function [atm] (atm list) -> 
(atm list), one has to apply rule (11) to the subphrase new a in ...end. 
Amongst other things, this requires one to prove 

(ifeq(b,a) then (rem2 a.bs) else b::(rem2 a.bs)) # a 

in a typing context whose only freshness assumptions are rem2 # a and e # a. 
For this to be possible we have to give a typing rule for ifeq(a,b) then - else - 
which, while checking the second branch of the conditional, adds the semantically 
correct information a # b to the typing context. Such a typing rule is (15) in 
Fig. 2. (This uses the notation $\Gamma[\mathtt{a} \mathrel{\#} \mathtt{b}]$ to indicate the typing context obtained 
from $\Gamma$ by adding the assumption a # b; we omit the straightforward formal 
definition of this for the typing contexts defined as in equation (9).) Although 
we take account of the fact that a and b denote distinct atoms when checking 
the second branch of the conditional, we have not found a need in practice to 
take account of the fact that they denote the same atom when checking the first 
branch (by strengthening the second hypothesis of this rule to 
$[\mathtt{a}/\mathtt{b}](\Gamma \vdash \mathit{exp} : \mathit{ty} \mathrel{\#} \overline{\mathtt{a}})$, for example). 

To get information about atom equality to where it is needed for type check- 
ing, FreshML also permits the use of guarded patterns 

pat where a = b and pat where a # b 

where a and b are value identifiers of type atm. Such guards are inter-definable 
with the ifeq( — , — ) then — else — construct, but often more convenient. 
Figure 1 gives several examples of their use. We omit the precise definition of 
matching such guarded patterns. Note also that the atom equality test expression 
eq(atexp, atexp') can be regarded as a macro for 

let val a = atexp in 

let val b = atexp' in 

ifeq(a,b) then true else false 

end 
end . 
7 Functions 

Recall from Sect. 2 that the intended meaning of freshness assertions in Fresh- 
ML has to do with the notion of the ‘support’ of an element of an FM-set 
(see Definition A.l in the Appendix). The nature of this notion is such that in 
any reasonable (recursively presented) type system for FreshML, the provable 
freshness assertions will always be a proper subset of those which are satisfied 
by the FM-sets model. This is because of the logical complexity of the statement 

‘the semantic atom associated with the identifier a is not in the support 
of the denotation of the function expression fn{x => exp}’ 

which involves extensional equality of mathematical functions. So if ‘provable 
freshness’ only gives an approximation to ‘not in the support of’, we should 
expect that not every denotationally sensible expression will receive a type. (Of 
course, such a situation is not uncommon for static analyses of properties of 
functional languages.) 

What approximation of the support of a function should be used to infer 
sound freshness information for function expressions in FreshML? We certainly 
want type-checking to be decidable. In the current experimental version of Fresh- 
ML, we take a simple approach (which does ensure decidability) making use of 
the following general property of freshness 

if a is fresh for all the free identifiers in an expression exp, then it is 
fresh for exp itself 

which is certainly sound for the denotational semantics in FM-Set. Applying 
this in the case when exp is a recursive function expression 

fun f = {x => exp} (36) 

we arrive at the typing rule (20) given in Fig. 2. As usual, free occurrences of f 
and x in exp become bound in the expression (36). We can regard non-recursive 
function expressions fn{x => exp} as the special case of expression (36) in which 
$\mathtt{f} \notin \mathrm{fv}(\mathit{exp})$. 

Example 7.1. Consider the following FreshML declaration of a function fv2 
for computing the list (possibly with repetitions) of free variables of a lambda 
term encoded as a value of the data type lam in Fig. 1. 

fun fv2 = 

  { Var a    => a::nil 
  | App(t,u) => append(fv2 t) (fv2 u) 
  | Lam a.t  => remove a (fv2 t) } 

This uses auxiliary functions append for joining two lists, and remove : atm -> 
(atm list) -> (atm list) for removing an atom from a list of atoms (using the 
fact that atm is an equality type), whose standard definitions we omit. 
One might hope that fv2 is assigned type lam->(atm list), but the current
FreshML type system rejects it as untypeable. The problem is the last match
clause, Lam a.t => remove a (fv2 t). As explained in Sect. 5, for this to type
check we have to prove

    (a : atm), (fv2 : (lam -> (atm list)) # {a}), (e : [atm] lam # {a})
    ⊢ remove a (fv2(e@a)) : atm list # {a}                              (37)

This is denotationally correct, because the denotation of remove maps a semantic 
atom a and a list of semantic atoms as to the list as \ {a}; and the support of 
the latter consists of all the semantic atoms in the list as that are not equal to 
a. However, the typing rules in Fig. 2 are not sufficiently strong to deduce (37). 
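
The denotational claim can be checked on a small instance. The following Python sketch is an illustration only, not FreshML: it represents lambda terms as tagged tuples with explicit bound names (so it does not identify α-equivalent terms), and the helper names `remove`, `fv2` and `swap_term` are assumptions made for this example. It shows that the list computed for a Lam node never mentions the bound atom, and that swapping the bound atom with an atom not occurring in the term leaves the result unchanged.

```python
def remove(a, atoms):
    """Remove every occurrence of atom a from a list of atoms."""
    return [x for x in atoms if x != a]

def fv2(term):
    """Free variables (with repetitions) of a lambda term given as a tagged tuple."""
    tag = term[0]
    if tag == "Var":
        return [term[1]]
    if tag == "App":
        return fv2(term[1]) + fv2(term[2])
    if tag == "Lam":
        return remove(term[1], fv2(term[2]))
    raise ValueError(f"unknown term: {term!r}")

def swap_term(a, b, term):
    """Apply the transposition (a b) to every atom occurring in a term."""
    sw = lambda x: b if x == a else a if x == b else x
    tag = term[0]
    if tag == "Var":
        return ("Var", sw(term[1]))
    if tag == "App":
        return ("App", swap_term(a, b, term[1]), swap_term(a, b, term[2]))
    return ("Lam", sw(term[1]), swap_term(a, b, term[2]))

# Lam a. ((a b) a): the atom a is bound, the atom b is free.
term = ("Lam", "a", ("App", ("App", ("Var", "a"), ("Var", "b")), ("Var", "a")))
free = fv2(term)
# The atom c does not occur in term, so (a c) only renames the binder.
renamed = fv2(swap_term("a", "c", term))
```

Both `free` and `renamed` contain only the atom b, which is the behaviour that judgement (37) expresses but that the rules of Fig. 2 cannot derive.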

It seems that this problem, and others like it, can be solved by using a
richer notion of type in which expressions like ‘ty # a’ become first-class types
which can be mixed with the other type constructors (in particular, with function
types). We have made some initial investigations into the properties of such richer
type systems (and associated notions of subtyping induced by making ty # a
a subtype of ty), but much remains to be done. However, for this particular
example there is a simple work-around which involves making better use of
atom-abstractions and the basic fact (12) about their support. Thus the declaration
of fv in Fig. 1 makes use of the auxiliary function rem : [atm] (atm list) ->
(atm list) for which

    (a : atm), (fv : (lam -> (atm list)) # {a}), (e : [atm] lam # {a})
    ⊢ (rem a.(fv(e@a))) : atm list # {a}

can be deduced using (12). It follows that fv does yield a function of type
lam -> (atm list) (and its denotation is indeed the function returning the list
of free variables of a lambda term).

Remark 7.2. Note that freshness is not a ‘logical relation’: just because a func- 
tion maps all arguments not having a given atom in their support to results not 
having that atom in their support, it does not follow that the atom is fresh for 
the function itself. Thus the following rule is unsound. 

    Γ, (f : (ty -> ty') # a), (x : ty # a) ⊢ exp : ty' # a      f, x ∉ dom(Γ)
    ------------------------------------------------------------------------- (wrong!)
    Γ ⊢ fun f = {x => exp} : (ty -> ty') # a

To see this, consider the following example. From rule (12) we have 

    a : atm, x : atm # {a} ⊢ a.x : [atm] atm # {a}

and so using the above rule (wrong!) we would be able to deduce

    a : atm ⊢ fn{x => a.x} : (atm -> [atm] atm) # {a}.

This is denotationally incorrect, because the denotation of fn{x => a.x} does 
contain the denotation of a in its support. Correspondingly, the operational be- 
haviour of this function expression depends on which semantic atom is associated 
with a: if we replace a by a' in fn{x => a.x}, then under the assumption a ≠ a'
we get a contextually inequivalent function expression, fn{x => a'.x}. Indeed,
applying these function expressions to the same argument a yields contextually
inequivalent results (in the first case an expression equivalent to a.a and in the
second case one equivalent to a'.a).
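
The counterexample in Remark 7.2 can be replayed concretely. The Python sketch below is a toy model under stated assumptions, not FreshML semantics: atoms are strings, an abstraction a.x of an atom x is stored in a canonical form that forgets the bound name, and functions are acted on by conjugation as in equation (40) of the Appendix. All helper names (`swap`, `abstract`, `support`, `act_abs`, `act_fun`) are invented for this illustration.

```python
def swap(a, b, x):
    """Apply the transposition (a b) to a single atom x."""
    if x == a:
        return b
    if x == b:
        return a
    return x

def abstract(a, x):
    """Canonical form of the atom abstraction a.x, where x is itself an atom."""
    return ("bound",) if x == a else ("free", x)

def support(e):
    """Support of an abstraction in canonical form."""
    return set() if e == ("bound",) else {e[1]}

def act_abs(a, b, e):
    """Action of the transposition (a b) on an abstraction in canonical form."""
    return e if e == ("bound",) else ("free", swap(a, b, e[1]))

def act_fun(a, b, f):
    """Action of (a b) on a function by conjugation; (a b) is its own inverse."""
    return lambda x: act_abs(a, b, f(swap(a, b, x)))

f = lambda x: abstract("a", x)   # models fn{x => a.x}
g = act_fun("a", "c", f)         # (a c) . f, which behaves like fn{x => c.x}
```

Every result `f(x)` has a support that omits the atom a, yet `f` and `g` disagree on the argument a: `f("a")` is the abstraction a.a, while `g("a")` is c.a. So the transposition (a c) does not fix `f`, and a is in the support of the function even though it is fresh for every result.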

8 Purity 

Consider the following declaration of a function count for computing the number 
of lambda abstractions in a lambda term encoded as a value of the data type 
lam in Fig. 1. 

    fun count =
      { Var a    => 0
      | App(t,u) => (count t)+(count u)
      | Lam a.t  => (count t)+1 }

For this to type check as a function lam-> int, the last match clause requires 

    (a : atm), (count : (lam -> int) # {a}), (t : lam)
    ⊢ ((count t) + 1) : int # {a}

to be proved. The interpretation of this judgement holds in the FM-sets model 
because the denotation of int is a ‘pure’ FM-set, i.e. one whose elements all 
have empty support. Accordingly, we add rule (22) in Fig. 2 to the FreshML 
type system; using it, we do indeed get that count has type lam -> int. The 
condition ‘ty pure’ in the hypothesis of this rule is defined by induction on the
structure of the type ty and amounts to saying that ty does not involve atm or
-> in its construction.⁶
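
The inductive definition of ‘ty pure’ can be sketched as a recursive predicate. The following Python fragment is a hedged illustration: the type grammar (classes `Atm`, `Int`, `Fun`, `AbsTy`, `Pair`, `ListTy`) is an assumption made for the example, since the actual grammar is given in Fig. 2, and `is_pure` applies the syntactic criterion stated above literally, so an abstraction type [atm] ty counts as impure because it mentions atm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atm:            # the type atm of atoms
    pass

@dataclass(frozen=True)
class Int:            # the type int
    pass

@dataclass(frozen=True)
class Fun:            # function type dom -> cod
    dom: object
    cod: object

@dataclass(frozen=True)
class AbsTy:          # atom abstraction type [atm] body
    body: object

@dataclass(frozen=True)
class Pair:           # product type left * right
    left: object
    right: object

@dataclass(frozen=True)
class ListTy:         # list type elem list
    elem: object

def is_pure(ty):
    """True when ty is built without atm or -> (syntactic criterion from the text)."""
    if isinstance(ty, (Atm, Fun, AbsTy)):
        return False          # mentions atm or a function arrow
    if isinstance(ty, Int):
        return True
    if isinstance(ty, Pair):
        return is_pure(ty.left) and is_pure(ty.right)
    if isinstance(ty, ListTy):
        return is_pure(ty.elem)
    raise TypeError(f"unknown type: {ty!r}")
```

Under this sketch `is_pure(Int())` holds, which is what rule (22) needs for count, while `is_pure(ListTy(Atm()))` fails, as it must for the result type of fv2.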

The current experimental version of FreshML is a pure functional program- 
ming language, in the sense that the only effects of expression evaluation are 
either to produce a value or to not terminate. This has an influence on the 
soundness of rule (22). For example, if we add to the language an exception 
mechanism in which exception packets contain values involving atoms, then it 
may no longer be the case that an integer expression exp : int satisfies exp # a 
for any atom a. To restore the soundness of rule (22) in the presence of such 
computational effects with non-trivial support, one might consider imposing a 
‘value restriction’, by insisting that the rule only applies to expressions exp 
that are non-expansive in the sense of [12, Sect. 4.7]. However, note that the 
rule (19) for let-expressions rather undoes such a value-restriction. For using 
rule (19), the freshness properties of exp which we could have deduced from the 
unrestricted rule (22) can be deduced for the semantically equivalent expression
let val x = exp in x end from the value-restricted version. This highlights
the fact that the soundness of rule (19), and also rules (14) and (18) in Fig. 2,
depends upon the evaluation of exp not producing an effect with non-empty
support. Should one try to restrict these rules as well? Probably it is better to
curtail the computational effects. For example, although it is certainly desirable
to add an exception-handling mechanism to the current version of the language,
it may be sufficient to have one which only raises packets containing values with
empty support (character strings, integers, etc). Investigation of this is a matter
for future work, which brings us to our final section.

⁶ For Theorem 2.1 to hold, in rule (22) we need that ty pure implies supp(v) = ∅ for
any semantic value v of type ty; excluding the use of function types (as well as atm)
in pure types is a simple way of ensuring this.



9 Related and Future Work 

Related work 

The model presented in [7] was one of three works on the metamathematics of 
syntax with binders using categories of (pre)sheaves which appeared simulta- 
neously in 1999 — the other two being [5] and [9]. The starting point for these 
works is a discovery, made independently by several of the authors, which can 
be stated roughly as follows. 

The quotient by α-equivalence of an inductively defined set of abstract 
syntax trees (for some signature involving binders) can be given an initial 
algebra semantics provided one works with initial algebras of functors 
not on sets, but on categories of ‘variable’ sets, i.e. certain categories of 
sheaves or presheaves. 

There is a strong connection between initial algebras for functors and recursive 
data types, so this observation should have some consequences for programming 
language design, and more specifically, for new forms of user declared data type. 
That is what we have investigated here. For us, the notion of (finite) support, 
which is a key feature of the model in [7], was crucial — giving rise as it does to 
FreshML’s idiom of computing with freshness. While the presheaf models con- 
sidered in [5] have weaker notions of support (or ‘stage’) than does the FM-sets 
model, it seems that they too can model languages with notions of abstraction 
similar to the one in FreshML (Plotkin, private communication). 

Miller [10] proposed tackling the problems motivating the work in this paper
by incorporating the techniques of higher order abstract syntax, HOAS [16],
into an ML-like programming language, MLλ, with intentional function types
ty => ty'. Compared with HOAS, the approach in [7] and [5] is less ambitious
in what it seeks to lift to the metalevel: like HOAS we promote object-level
renaming to the metalevel, but unlike HOAS we leave object-level substitution to
be defined case-by-case using structural recursion. The advantage is that FreshML
data types using [atm] ty retain the pleasant recursion/induction properties
of classical first order algebraic data types: see Sect. 5 of [7]. It is also the case
that names of bound variables are inaccessible to the MLλ programmer, whereas
they are accessible to a FreshML programmer in a controlled way.
Formal properties of FreshML

In this paper we have mostly concentrated on explaining the ideas behind our 
approach and giving examples, rather than on presenting the formal properties 
of the language. Such properties include: 

1. Decidability of type/freshness inference. 

2. Correspondence results connecting the operational and denotational seman- 
tics. 

3. The correctness of the encoding of the set of terms over a ‘binding signature’
[5, Sect. 2], modulo renaming of bound variables, as a suitable FreshML
data type. For example, values of type lam in Fig. 1 modulo contextual
equivalence (or equality of denotations) correspond to α-equivalence classes
of untyped lambda terms, with free identifiers of type atm corresponding to
free variables.

4. Transfer of principles of structural induction for inductively defined FM-sets
involving atom abstraction (cf. [7, Sect. 5]) to induction principles for FreshML
data types. More generally, the ‘И-quantifier’ introduced in [7] should
feature in an LCF-style program logic for FreshML.

Some of the details will appear in the second author’s forthcoming PhD the- 
sis [6]. At this stage, just as important as proving such properties is accumulat- 
ing experience with programming in FreshML, to see if the idiom it provides for 
metaprogramming with bound names modulo renaming is a useful one. 



Future work 

To improve the usefulness of FreshML for programming with bound names, it 
is already clear that we must investigate richer syntactic forms of abstraction. 
At the moment we permit abstraction over a single type of atoms. We already 
noted in Sect. 2 that it would be a good idea to allow the declaration of distinct 
sorts of atoms (for example, to more easily encode distinct sorts of names in an 
object language). Indeed, it might be a good idea to allow atom polymorphism 
via Haskell-style type classes [15], with a type class for ‘types of atoms’. But 
even with a single sort of atoms, there is good reason to consider notions of 
abstraction in which the data to the left of the ‘abstraction dot’ is structurally 
more complicated than just single atoms. For example, some object languages 
use operators that bind varying numbers of variables rather than having a fixed 
‘arity’ (for example, the free identifiers in an ML match m, however many there
may be, become bound in the function expression fn{m}). To encode such
operators we can make use of the following FreshML data type construction

    datatype 'a abs = Val of 'a | Abs of [atm]('a abs)

whose values Abs a1.(Abs a2.( ··· Val val)) are some finite number of
atom-abstractions of a value val of type 'a. When specifying a function on 'a abs by
structural recursion, one has to recurse on the list of binders in such a value,
whereas in practice one usually wants to recurse directly on the structure of the
innermost value val. Therefore it would be useful to have ‘atom-list abstraction
types’ [atm list] ty (denotationally isomorphic to ty abs) and abstraction
expressions of the form as.exp, where as is a value of type atm list and exp
is an expression of type ty. But if one abstracts over atom-lists, why not over
other concrete types built up from atoms? Indeed, in addition to such concrete 
types, it appears to be useful to abstract with respect to certain abstract types, 
such as finite sets of atoms, or finite mappings defined on atoms. So it may be 
appropriate to consider a general form of abstraction type, [ty] ty', for arbitrary
types ty and ty' . To do so would require some changes to the nature of the 
freshness judgement (7), whose details have yet to be worked out. In fact, the 
FM-sets model contains a set-former for such a general notion of abstraction, so 
there is a firm semantic base from which to explore this extension of FreshML. 
Emphasising this firm semantic base is a good note on which to finish. Having
come this far, we hope the reader agrees that FreshML has some novel
and potentially useful features for metaprogramming modulo renaming of bound 
variables. But whatever the particularities of FreshML, we believe the real source 
of this potential is the FM-sets model, which appears to provide a simple, elegant 
and useful mathematical foundation for computing and reasoning about name 
binding modulo renaming. It is certainly the case that without it, we would not 
have reached the language design described in this paper. (So please read the 
Appendix!) 

A Appendix: FM-Sets 

Naively speaking, ML types and expressions denote sets and functions. By contrast,
FreshML types and expressions are intended to denote FM-sets and equivariant
functions. This appendix gives a brief review of these notions; more details
can be found in [7]. Of course, the presence of recursive features and computa- 
tional effects in ML means that a denotational semantics for it really involves 
much more complicated mathematical structures than mere sets. Similarly, to 
account for the recursive features of the present version of FreshML, we should 
really give it a denotational semantics using domains and continuous functions in 
the category of FM-sets. For simplicity’s sake we suppress these domain-theoretic 
details here. 

Notation. Let $A = \{a, a', \ldots\}$ be a fixed countably infinite set, whose elements
we call semantic atoms. Let $S_A$ denote the group of all permutations of $A$. Thus
the elements $\pi$ of $S_A$ are bijections from $A$ to itself. The group multiplication
takes two such bijections $\pi$ and $\pi'$ and composes them; we write the composition
of $\pi$ followed by $\pi'$ as $\pi'\pi$. The group identity is the identity function on $A$,
denoted by $\mathrm{id}_A$.

Recall that an action of the group $S_A$ on a set $X$ is a function $(-) \cdot_X (-)$
mapping pairs $(\pi, x) \in S_A \times X$ to elements $\pi \cdot_X x \in X$ and satisfying

$$\pi' \cdot_X (\pi \cdot_X x) = (\pi'\pi) \cdot_X x \quad\text{and}\quad \mathrm{id}_A \cdot_X x = x$$
for all $\pi, \pi' \in S_A$ and $x \in X$. For example, if $\Lambda$ is the set of syntax trees of
lambda terms with variable symbols from $A$

$$\Lambda = \{ t ::= a \mid t\,t \mid \lambda a.t \quad (a \in A) \} \tag{38}$$

then there is a natural action of $S_A$ on $\Lambda$: for each $\pi \in S_A$ and $t \in \Lambda$, $\pi \cdot_\Lambda t$ is the
tree which results from permuting all the atoms occurring in $t$ according to $\pi$.

In general, an action of $S_A$ on a set $X$ gives us an abstract way of regarding the
elements $x$ of $X$ as somehow ‘involving atoms from $A$ in their construction’,
inasmuch as the action tells us how permuting atoms changes $x$, which turns out to
be all we need for an abstract theory of renaming and binding. An important part
of this theory is the notion of finite support. This generalises the property of an
abstract syntax tree that it only involves finitely many atoms in its construction
to the abstract level of an element of any set equipped with an $S_A$-action.

Definition A.1 (Finite support). Given a set $X$ equipped with an action
of $S_A$, a set of semantic atoms $\omega \subseteq A$ is said to support an element $x \in X$ if all
permutations $\pi \in S_A$ which fix every element of $\omega$ also fix $x$:

$$(\forall a \in \omega.\ \pi(a) = a) \;\Rightarrow\; \pi \cdot_X x = x. \tag{39}$$

We say that $x$ is finitely supported if there is some finite subset $\omega \subseteq A$ supporting
it. It is not too hard to show that if $x$ is finitely supported, then there is a smallest
finite subset of $A$ supporting it: we call this the support of $x$, and denote it by
$\mathrm{supp}_X(x)$.

Definition A.2 (The category FM-Set). An FM-set is a set $X$ equipped
with an action of the permutation group $S_A$ in which every element $x \in X$ is
finitely supported. These are the objects of a category, FM-Set, whose morphisms
$f : X \to Y$ are equivariant functions, i.e. functions from $X$ to $Y$
satisfying

$$f(\pi \cdot_X x) = \pi \cdot_Y f(x)$$

for all $\pi \in S_A$ and $x \in X$.

Example A.3. The set $\Lambda$ of untyped lambda terms, defined as in (38) and with
the $S_A$-action mentioned there, is an FM-set. The support of $t \in \Lambda$ is just the
finite set of all variable symbols occurring in the tree $t$ (whether free, bound, or
binding). Note that if two lambda terms are α-equivalent, $t =_\alpha t'$, then for any
permutation $\pi$ one also has $\pi \cdot_\Lambda t =_\alpha \pi \cdot_\Lambda t'$. It follows that $(-) \cdot_\Lambda (-)$ induces an
action on the quotient set $\Lambda/{=_\alpha}$. It is not hard to see that this is also an FM-set,
with the support of an α-equivalence class of lambda terms being the finite set
of free variable symbols in any representative of the class (it does not matter
which). This turns out to be the denotation of the data type lam declared in
Fig. 1 (using [7, Theorem 5.1]).

It should be emphasised that the above definitions are not novel, although
the use to which we put them is. They are part of the rich mathematical theory
of continuous actions of topological groups. $S_A$ has a natural topology as a
subspace of the countably infinite product of the discrete space $A$, which makes
it a topological group. Given an action of $S_A$ on a set $X$, all the elements of $X$
have finite support if and only if the action is a continuous function $S_A \times X \to X$
when $X$ is given the discrete topology. Thus FM-Set is an example of a category
of ‘continuous G-sets’ and as such, much is known about its properties: see [7,
Sect. 6] for references. Here we just recall its cartesian closed structure.
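
The permutation action on syntax trees and the support condition (39) can be made concrete on raw (unquotiented) lambda terms. The Python sketch below is illustrative only: finite permutations are represented as dictionaries, terms as tagged tuples, and the names `apply`, `act`, `compose`, `atoms` and `fixes` are assumptions of this example rather than notation from the paper.

```python
def apply(pi, a):
    """Apply a finite permutation (a dict) to an atom; unlisted atoms are fixed."""
    return pi.get(a, a)

def act(pi, t):
    """Permutation action on syntax trees: rename every atom occurrence by pi."""
    tag = t[0]
    if tag == "Var":
        return ("Var", apply(pi, t[1]))
    if tag == "App":
        return ("App", act(pi, t[1]), act(pi, t[2]))
    return ("Lam", apply(pi, t[1]), act(pi, t[2]))

def compose(p2, p1):
    """The permutation 'p1 followed by p2'."""
    keys = set(p1) | set(p2)
    return {a: apply(p2, apply(p1, a)) for a in keys}

def atoms(t):
    """All atoms occurring in a raw tree (free, bound or binding): its support."""
    tag = t[0]
    if tag == "Var":
        return {t[1]}
    if tag == "App":
        return atoms(t[1]) | atoms(t[2])
    return {t[1]} | atoms(t[2])

def fixes(pi, w):
    """True when pi fixes every atom in the set w."""
    return all(apply(pi, a) == a for a in w)

t = ("Lam", "a", ("App", ("Var", "a"), ("Var", "b")))
p1 = {"a": "b", "b": "a"}     # the transposition (a b)
p2 = {"b": "c", "c": "b"}     # the transposition (b c)
outside = {"c": "d", "d": "c"}  # moves only atoms that do not occur in t
```

Here `act(p2, act(p1, t))` agrees with `act(compose(p2, p1), t)`, which is the first action law, and `outside` fixes every atom of `atoms(t)` and therefore fixes `t`, as (39) requires of a supporting set.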

A.1 Products in FM-Set

These are given by taking the cartesian product of underlying sets

$$X \times Y = \{ (x, y) \mid x \in X \text{ and } y \in Y \}.$$

The permutation action is given componentwise by that of $X$ and $Y$:

$$\pi \cdot_{X \times Y} (x, y) = (\pi \cdot_X x,\ \pi \cdot_Y y) \qquad (\pi \in S_A).$$

Each pair $(x, y) \in X \times Y$ does indeed have finite support, namely
$\mathrm{supp}_X(x) \cup \mathrm{supp}_Y(y)$.

A.2 Exponentials in FM-Set

Given any function (not necessarily an equivariant one) $f : X \to Y$ between
FM-sets $X$ and $Y$, we can make permutations of $A$ act on $f$ by defining

$$\pi \cdot_{X \to Y} f = \lambda x \in X.\ \big(\pi \cdot_Y f(\pi^{-1} \cdot_X x)\big). \tag{40}$$

By applying the property (39) with $\cdot_{X \to Y}$ in place of $\cdot_X$, it makes sense to ask
whether $f$ has finite support. The subset of all functions from $X$ to $Y$ which do
have finite support in this sense, together with the action $\cdot_{X \to Y}$ given by (40),
forms an FM-set which is the exponential $X \to Y$ in FM-Set. Note that (40)
implies that for all $\pi \in S_A$, $f \in (X \to Y)$ and $x \in X$

$$\pi \cdot_Y f(x) = (\pi \cdot_{X \to Y} f)(\pi \cdot_X x)$$

and hence evaluation $(f, x) \mapsto f(x)$ determines an equivariant function
$ev : (X \to Y) \times X \to Y$. Given any $f : Z \times X \to Y$ in FM-Set, its exponential
transpose $cur(f) : Z \to (X \to Y)$ is given by the usual ‘curried’ version of $f$,
for the following reason: given any $z \in Z$, if $\pi \in S_A$ fixes each semantic atom
in $\mathrm{supp}_Z(z)$, then it is not hard to see from definition (40) and the equivariance
of $f$ that $\pi$ fixes the function $\lambda x \in X.\ f(z, x)$ as well; so this function has finite
support and hence is in $X \to Y$. So defining $cur(f)(z)$ to be $\lambda x \in X.\ f(z, x)$,
we get a function $cur(f) : Z \to (X \to Y)$; it is equivariant because $f$ is and
clearly has the property required for it to be the exponential transpose of $f$.
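
The conjugation action (40) is easy to experiment with for functions from atoms to atoms. The following Python sketch is an illustration under simplifying assumptions: permutations are dictionaries that are bijections on their keys, and the names `apply`, `inverse` and `act_fun` are hypothetical helpers, not part of any FreshML implementation.

```python
def apply(pi, a):
    """Apply a finite permutation (a dict) to an atom; unlisted atoms are fixed."""
    return pi.get(a, a)

def inverse(pi):
    """Inverse of a finite permutation given as a dict."""
    return {v: k for k, v in pi.items()}

def act_fun(pi, f):
    """(pi . f)(x) = pi . f(pi^{-1} . x), as in (40), for functions on atoms."""
    inv = inverse(pi)
    return lambda x: apply(pi, f(apply(inv, x)))

pi = {"a": "b", "b": "c", "c": "a"}   # a 3-cycle on atoms
identity = lambda x: x                # equivariant, so its support is empty
const_a = lambda x: "a"               # not equivariant; its support is {a}
sample = ["a", "b", "c", "d"]
```

For every atom `x` in `sample`, `apply(pi, const_a(x))` equals `act_fun(pi, const_a)(apply(pi, x))`, which is the displayed consequence of (40). The identity function is fixed by the action, whereas `act_fun(pi, const_a)` is the constant function returning b, so the permutation moves `const_a`.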

In the rest of this appendix we indicate the structures in the category FM-Set 
used to model the key novelties of FreshML: locally fresh atoms, atom abstraction 
and concretion of atom abstractions. 
A.3 Locally Fresh Atoms

The set $A$ of atoms becomes an object of FM-Set once we endow it with the
action:

$$\pi \cdot_A a = \pi(a) \qquad (\pi \in S_A,\ a \in A).$$

(The support of $a \in A$ is just $\{a\}$.) This FM-set is the denotation of the type
atm of atoms in FreshML.

The meaning of new a in exp end in FM-Set is given by the following
proposition (whose straightforward proof we omit). It makes use of the following
construction in the category FM-Set: given an FM-set $G$, define

$$A \otimes G = \{ (a, g) \in A \times G \mid a \notin \mathrm{supp}_G(g) \}.$$

This becomes an FM-set if we define a permutation action by

$$\pi \cdot_{A \otimes G} (a, g) = (\pi(a),\ \pi \cdot_G g).$$

Proposition A.1. Given a morphism $e : A \otimes G \to T$ in FM-Set satisfying

$$a \notin \mathrm{supp}_T(e(a, g)) \quad\text{for all } (a, g) \in A \otimes G \tag{41}$$

then there is a unique morphism $new(e) : G \to T$ such that

$$new(e)(g) = e(a, g) \quad\text{for all } g \in G \text{ and } a \notin \mathrm{supp}_G(g) \tag{42}$$

is satisfied. □

If a typing context Γ has denotation $G$ and if a ∉ dom(Γ), then the denotation
of the typing context (a : atm) ⊗ Γ used in rule (11) of Fig. 2 is the FM-set
$A \otimes G$. Suppose the denotation of (a : atm) ⊗ Γ ⊢ exp : ty is the equivariant
function $e : A \otimes G \to T$. We can use Proposition A.1 to give a denotation
to Γ ⊢ (new a in exp end) : ty as $new(e) : G \to T$, using the fact that the
freshness part of the hypothesis of (11) means that $e$ satisfies condition (41).
The soundness of rule (11) follows from the defining property (42) of $new(e)$.
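
Proposition A.1 says that the choice of fresh atom does not matter. The Python sketch below illustrates this in a toy model, under assumptions that are not from the paper: atoms are strings, G is the FM-set of atoms itself, and an abstraction a.t of a list of atoms is stored in a canonical form in which occurrences of the bound atom are replaced by `None`. The names `fresh`, `abstract`, `e` and `new` are hypothetical.

```python
import itertools

def fresh(avoid):
    """Return some atom (a string) that is not in the finite set avoid."""
    for i in itertools.count():
        candidate = f"n{i}"
        if candidate not in avoid:
            return candidate

def abstract(a, atoms):
    """Canonical form of the abstraction a.atoms: occurrences of a become None."""
    return tuple(None if y == a else y for y in atoms)

def e(a, x):
    """A sample morphism on pairs (a, x) with a != x, sending (a, x) to a.[a, x]."""
    assert a != x, "e is only defined when a is fresh for x"
    return abstract(a, [a, x])

def new(e, supp):
    """new(e)(g) = e(a, g) for any atom a outside supp(g), following (42)."""
    def new_e(g):
        a = fresh(supp(g))
        return e(a, g)
    return new_e

# For an atom x the support is {x}.
new_e = new(e, lambda x: {x})
```

The result `e(a, x)` is the canonical form `(None, x)`, whose support {x} never contains `a`, so condition (41) holds. Consequently `e("b", "x")`, `e("c", "x")` and `new_e("x")` all coincide, whichever fresh atom is picked.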

A.4 Abstraction and Concretion

If the denotation of the FreshML type ty is the FM-set $T$, the denotation of the
atom abstraction type [atm] ty is the FM-set $[A]T$ introduced in Sect. 4 of [7]. Its
underlying set is the quotient of the cartesian product $A \times T$ by the equivalence
relation $\sim_A$ defined as follows: $(a, t) \sim_A (a', t')$ holds if and only if

$$(a''\ a) \cdot_T t = (a''\ a') \cdot_T t'$$

holds for some (or equivalently, any) $a''$ not in the support of $a$, $a'$, $t$, or $t'$ (where
$(a''\ a) \in S_A$ denotes the permutation interchanging $a''$ and $a$). We write $a.t$ for
the $\sim_A$-equivalence class determined by $(a, t)$.⁷ The action of a permutation
$\pi \in S_A$ on elements of $[A]T$ is given by:

$$\pi \cdot_{[A]T} (a.t) = (\pi(a)).(\pi \cdot_T t).$$

This does indeed determine a well-defined FM-set, with the support of $a.t$ being
the finite set $\mathrm{supp}_T(t) \setminus \{a\}$.

If a is associated with the semantic atom $a \in A$ (in some given value environment)
and the denotation of exp : ty is $t \in T$, then the denotation of the
atom abstraction expression a.exp is $a.t \in [A]T$.

On the other hand, the meaning of concretion expressions (Sect. 4) uses the
following property of FM-sets.

Proposition A.2. Given an FM-set $T$, for each atom-abstraction $e \in [A]T$ and
semantic atom $a \in A$, if $a \notin \mathrm{supp}_{[A]T}(e)$ then there is a unique element $e@a$
of $T$ such that $e = a.(e@a)$.

When $e = a'.t$ then $a \notin \mathrm{supp}_{[A]T}(e)$ if and only if $a = a'$ or $a \notin \mathrm{supp}_T(t)$;
and in this case $(a', t) \sim_A (a, (a\ a') \cdot_T t)$. So when $a = a'$ or $a \notin \mathrm{supp}_T(t)$, it
follows from the uniqueness part of the defining property of $e@a$ that

$$(a'.t)@a = (a\ a') \cdot_T t \tag{43}$$

holds. □

The partial function $e, a \mapsto e@a$ is used to give the denotational semantics of
concretion expressions, exp@a. The soundness of rule (13) follows from the easily
verified fact that $\mathrm{supp}_T(e@a) \subseteq \mathrm{supp}_{[A]T}(e) \cup \{a\}$.
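
Abstraction and concretion can be tried out in a toy model where $T$ is the set of finite tuples of atoms. The Python sketch below is an illustration under that assumption only: an abstraction a.t is stored in a canonical form with the bound atom replaced by `None`, which identifies exactly the pairs related by $\sim_A$ for this choice of $T$. The names `swap`, `abstract`, `supp_abs` and `concrete` are hypothetical.

```python
def swap(a, b, t):
    """Apply the transposition (a b) to a tuple of atoms."""
    return tuple(b if y == a else a if y == b else y for y in t)

def abstract(a, t):
    """Canonical form of the abstraction a.t: occurrences of a become None."""
    return tuple(None if y == a else y for y in t)

def supp_abs(e):
    """Support of an abstraction: the atoms of the body other than the bound one."""
    return {y for y in e if y is not None}

def concrete(e, a):
    """Concretion e@a, defined only when a is not in the support of e."""
    if a in supp_abs(e):
        raise ValueError(f"{a!r} is not fresh for the abstraction")
    return tuple(a if y is None else y for y in e)

t = ("b", "c", "b")
e = abstract("b", t)     # the abstraction b.(b, c, b)
```

In this model `concrete(e, "d")` equals `swap("d", "b", t)`, matching (43), and abstracting the result over d gives back `e`, matching the defining property $e = a.(e@a)$ of Proposition A.2. Concretion at c is rejected because c lies in the support of `e`.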